Controversies and uncertainty persist in prostate cancer grading.
To update grading recommendations.
Critical review of the literature along with pathology and clinician surveys.
Percent Gleason pattern 4 (%GP4) is as follows: (1) report %GP4 in needle biopsy with Grade Groups (GrGp) 2 and 3, and in needle biopsy on other parts (jars) of lower grade in cases with at least 1 part showing Gleason score (GS) 4 + 4 = 8; and (2) report %GP4: less than 5% or less than 10% and 10% increments thereafter.
Tertiary grade patterns are as follows: (1) replace “tertiary grade pattern” in radical prostatectomy (RP) with “minor tertiary pattern 5 (TP5),” and only use in RP with GrGp 2 or 3 with less than 5% Gleason pattern 5; and (2) minor TP5 is noted along with the GS, with the GrGp based on the GS.
Global score and magnetic resonance imaging (MRI)-targeted biopsies are as follows: (1) when multiple undesignated cores are taken from a single MRI-targeted lesion, an overall grade for that lesion is given as if all the involved cores were one long core; and (2) if providing a global score, when different scores are found in the standard and the MRI-targeted biopsy, give a single global score (factoring both the systematic standard and the MRI-targeted positive cores).
Grade Groups are as follows: (1) Grade Groups (GrGp) is the terminology adopted by major world organizations; and (2) retain GS 3 + 5 = 8 in GrGp 4.
Cribriform carcinoma is as follows: (1) report the presence or absence of cribriform glands in biopsy and RP with Gleason pattern 4 carcinoma.
Intraductal carcinoma (IDC-P) is as follows: (1) report IDC-P in biopsy and RP; (2) use criteria based on dense cribriform glands (>50% of the gland is composed of epithelium relative to luminal spaces) and/or solid nests and/or marked pleomorphism/necrosis; (3) it is not necessary to perform basal cell immunostains on biopsy and RP to identify IDC-P if the results would not change the overall (highest) GS/GrGp part per case; (4) do not include IDC-P in determining the final GS/GrGp on biopsy and/or RP; and (5) “atypical intraductal proliferation (AIP)” is preferred for an intraductal proliferation of prostatic secretory cells which shows a greater degree of architectural complexity and/or cytological atypia than typical high-grade prostatic intraepithelial neoplasia, yet falling short of the strict diagnostic threshold for IDC-P.
Molecular testing is as follows: (1) Ki67 is not ready for routine clinical use; (2) additional studies of active surveillance cohorts are needed to establish the utility of PTEN in this setting; and (3) dedicated studies of RNA-based assays in active surveillance populations are needed to substantiate the utility of these expensive tests in this setting.
Artificial intelligence and novel grading schema are as follows: (1) incorporating reactive stromal grade, percent GP4, minor tertiary GP5, and cribriform/intraductal carcinoma are not ready for adoption in current practice.
Originally proposed by Donald Gleason in 1966, the Gleason grading system has evolved with several important modifications in the past 2 decades.1–5 Cumulatively, Gleason pattern 3 has been narrowed to a more uniform architectural definition and the morphologic features grouped as Gleason pattern 4 have expanded. In parallel, the introduction of a prognostic Grade Group system and its subsequent corroboration across multiple cohorts has been a significant patient-centric advance in prostate cancer grading.6–11 These modifications (Table 1), based on experiences of tens of thousands of cases from large institutional cohorts, the literature, expert opinions, deliberation, and consensus, have been essential to keep the grading paradigm clinically relevant and a foremost prognostic factor even in the era of “omics” medicine.12–16
The Genitourinary Pathology Society (GUPS) was formed in 2018 as a global organization with a vision to advance the care of patients with urologic diseases through enhancing best practices, research, and education in the subspecialty of urologic pathology. Its membership is composed of academic and community pathologists, and residents and fellows with interest in urologic pathology. It is also designed to be inclusive of clinicians, including urologists, medical oncologists, and radiation oncologists, as well as translational and basic scientists. At the March 2019 United States and Canadian Academy of Pathology meeting, several members of the leadership met to discuss how the society should approach unresolved and emerging issues in prostate cancer grading. It was decided that a review of available evidence regarding expert-derived topics by diverse members of its international membership, coupled with an understanding of current practice patterns gleaned from membership and clinician responses to survey questionnaires on grading, would form the basis of GUPS recommendations on these prostate cancer grading issues.
Two GUPS surveys on practice patterns relating to grading were formulated and sent to both GUPS pathology members as well as to several clinician groups and organizations with broad international representation. Selected results from the clinician survey that were completed by August 23, 2019 are presented herein. The results of the complete clinician survey will be the subject of a separate detailed manuscript. The GUPS pathology survey results included in this paper are based on participation of 230 pathologists who took part in the survey with 211 completing the survey. A summary of the participant demographics for the GUPS pathologist survey is shown in Table 2. Five hundred fifty-one (551) clinicians took part in the clinical survey with 371 completing the survey. Eighty-four percent (463 of 551) of the clinicians were urologists with 7.3% (40 of 551) medical oncologists and 8.7% (48 of 551) radiation oncologists. Most of the clinicians were from the United States (225 of 551; 40.8%) or Europe (226 of 551; 41.0%). The results from both surveys were integrated along with a critical review of the literature to formulate the GUPS recommendations in this position paper on prostate cancer grading.
Members of each working group are listed in the Appendix.
WORKING GROUP 1: PERCENT GLEASON PATTERN 4
A summary of recommendations on Percent Pattern 4 is seen in Table 3.
Reporting Quantity of Gleason Pattern 4 in Needle Biopsy With Highest Gleason Score 3 + 4 = 7 (Grade Group 2)
Building upon the Grade Group system, one of the significant recommendations that emerged from the 2014 International Society of Urological Pathology (ISUP) conference, and that was incorporated into the World Health Organization (WHO) Classification of Genitourinary Tumors, was the reporting of percent Gleason pattern 4 in Grade Groups 2 and 3.17,18 Although those discussions included both needle biopsy and radical prostatectomy specimens, reporting bodies, including the College of American Pathologists, have only required its reporting in needle biopsies with Gleason score 7 (Grade Groups 2–3) cancers. For patients with highest Grade Group 2 on needle biopsy, this requirement is predicated upon a clinical rationale. With increased utilization of active surveillance as a management strategy for many patients with very low-risk/low-risk prostate cancer, the National Comprehensive Cancer Network (NCCN) guidelines now also consider active surveillance for select favorable intermediate-risk patients, which includes low-volume Grade Group 2 disease, depending on life expectancy and other clinical/radiologic factors.19 Hence, routine reporting of percent Gleason pattern 4 in cases with Grade Group 2 may determine active surveillance eligibility.
Emerging evidence supports this notion, with studies by Huang et al20 and Kir et al21 demonstrating similar rates of radical prostatectomy adverse pathology for patients with needle biopsy Grade Group 1 versus Grade Group 2 with 5% or less Gleason pattern 4. Cole et al22 demonstrated that increasing overall percent Gleason pattern 4 on needle biopsy correlates with increasing rate of radical prostatectomy adverse pathology. Perlis et al23 recommend that needle biopsy assessment of percent Gleason pattern 4 be used in combination with the patient's age, prostate-specific antigen (PSA) levels, and disease volume to predict pathologic stage T3 disease. Most recently, Dean et al24 showed that quantification of Gleason pattern 4 in needle biopsy Grade Group 2 patients adds predictive value for adverse radical prostatectomy pathology and benefit to clinical decision-making for active surveillance.
These collective studies represent accumulating evidence supporting the recommendation to record percent Gleason pattern 4 in needle biopsy with Grade Group 2. In parallel, this reporting pattern has become standard practice for many surgical pathologists, as reflected in the 90% (202 of 225) of GUPS pathologist survey respondents who stated they routinely report percent Gleason pattern 4 in such patients. A large majority (435 of 547; 79.5%) of clinicians in the GUPS clinical survey also reported that at some time they have used the quantity of pattern 4 on needle biopsy in clinical decision-making. Looking to the future, it will be important to study quantification of Gleason pattern 4 in populations of Grade Group 2 patients being managed by active surveillance to determine its true clinical impact on progression to active therapy.
Reporting Quantity of Gleason Pattern 4 in Needle Biopsy Specimens With Highest Gleason Score 4 + 3 = 7 (Grade Group 3)
Pathologists have long separated prostate cancer with Gleason score 7 into 3 + 4 = 7 and 4 + 3 = 7 disease because of its prognostic differences.25 This distinction has been further highlighted in the era of Grade Groups, which designate Gleason score 3 + 4 = 7 as Grade Group 2 and Gleason score 4 + 3 = 7 as Grade Group 3.
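The correspondence between Gleason scores and Grade Groups referred to throughout this paper can be expressed as a small lookup. The following Python sketch is illustrative only: the function name grade_group is hypothetical, and the sketch assumes that both the primary and secondary patterns are 3, 4, or 5.

```python
def grade_group(primary: int, secondary: int) -> int:
    """Map a Gleason score (primary + secondary pattern) to a Grade Group.

    Assumption: only Gleason patterns 3, 4, and 5 are handled in this sketch.
    """
    for pattern in (primary, secondary):
        if pattern not in (3, 4, 5):
            raise ValueError("this sketch only handles Gleason patterns 3 to 5")
    total = primary + secondary
    if total == 6:
        return 1  # Gleason score 3 + 3 = 6
    if total == 7:
        # 3 + 4 = 7 is Grade Group 2; 4 + 3 = 7 is Grade Group 3
        return 2 if primary == 3 else 3
    if total == 8:
        return 4  # 4 + 4, 3 + 5, or 5 + 3
    return 5  # Gleason scores 9 and 10


print(grade_group(3, 4))
print(grade_group(4, 3))
```

The only Gleason score for which the order of the patterns changes the Grade Group is 7, which is why the primary pattern is checked in that branch alone.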
Sampling error is a well-known pitfall of prostate needle biopsy and, consequently, it has been suggested that reporting percent Gleason pattern 4 may further stratify cases with highest Grade Group 3. Knowing whether the Gleason pattern 4 component of Gleason score 4 + 3 = 7 on biopsy is 60% versus 90% may be beneficial for patient counseling and treatment decisions. Grade Group 3 with 60% Gleason pattern 4 on biopsy has a relatively higher likelihood of reflecting a Grade Group 2 or Grade Group 3 tumor nodule, compared with a Grade Group 3 with 90% Gleason pattern 4 on biopsy, which has a relatively increased chance of sampling a Grade Group 3 or higher grade tumor. Additionally, the interobserver variability for some Gleason pattern 4 cancers has been documented.26 Assigning a larger percent Gleason pattern 4 imparts a greater sense of confidence in the amount of Gleason pattern 4 present. For example, a case with 60% Gleason pattern 4 may reflect to clinicians that a tumor is borderline between Grade Groups 2 and 3, whereas a case with 70% to 80% Gleason pattern 4 is more likely to be interpreted as a more definitive Grade Group 3 cancer.
Reporting percent Gleason pattern 4 in cases with highest needle biopsy Grade Group 3 may have the most profound impact in patients electing radiation therapy, as current NCCN guidelines for addition of androgen-deprivation therapy (ADT) after radiation therapy may hinge on grade classification [Grade Group 2 (favorable intermediate risk) = no ADT; Grade Group 3 (unfavorable intermediate risk) = ±ADT].27 This is especially relevant in cases with (1) limited Grade Group 3 carcinoma, (2) cancers borderline between Grade Groups 2 and 3, or (3) highest Grade Group 3 cases with multiple positive cores displaying a mix of grades.
While no study to date has focused on needle biopsy Grade Group 3 cases showing added predictive value beyond standard variables and/or benefit to clinical decision-making, data embedded in more comprehensive studies are mixed. For example, Huang et al20 found that increasing percent Gleason pattern 4 in needle biopsy Grade Group 3 cancers was associated with increased high-grade disease (>Grade Group 3) and higher pT stage at radical prostatectomy. Similarly, Sauter et al28 showed that percent Gleason pattern 4 expressed in quartiles, Grade Group 3 “low” (50%–74% Gleason pattern 4) versus Grade Group 3 “high” (75%–94% Gleason pattern 4), was associated with an increased rate of radical prostatectomy Grade Group 3 or higher.28 Conversely, Cole et al22 found no differences in rates of adverse pathology at radical prostatectomy among needle biopsy Grade Group 3 patients stratified by percent Gleason pattern 4.22
Current College of American Pathologists (CAP) protocols, echoing ISUP 2014 conference recommendations, require reporting percent pattern 4 for Grade Group 3, which is reflected in 74% (92 of 196) of GUPS grading survey respondents who report percent Gleason pattern 4 in this scenario. In biopsies with highest Grade Group 3 (Gleason score 4 + 3 = 7) prostate cancer, 63% (194 of 309) of clinicians reported that it would be valuable to know the quantity of Gleason pattern 4 on needle biopsy. On a practical level, pathologists assigning Grade Group 3 have already decided most of the carcinoma in a core is Gleason pattern 4 and, hence, reporting a percent Gleason pattern 4 should not entail significant additional effort.
Reporting Quantity of Gleason Pattern 4 in Radical Prostatectomy Specimens With Highest Grade Groups 2 to 3 (Gleason Score 3 + 4 = 7, 4 + 3 = 7)
While radical prostatectomy patients with Grade Group 1 disease have a greater than 95% 5-year biochemical risk-free survival, those with Grade Group 5 have a poorer prognosis with 5-year biochemical risk-free survival of 35%.6,29 The outcome of patients with Grade Groups 2 and 3 at radical prostatectomy is more variable, and may depend on the quantity of Gleason pattern 4.28,30,31
Studies of percent Gleason pattern 4 in radical prostatectomy Grade Groups 2 to 3 have yielded a range of findings. Choy et al31 reported patients whose tumors had 21% to 50% Gleason pattern 4 had a 5-year biochemical risk-free survival of 84%, which decreased to 32% in patients with more than 70% Gleason pattern 4. Similarly, Sauter et al28 showed an association between increasing “quartiles” of percent Gleason pattern 4 and decreased rates of biochemical risk-free survival. However, other studies that used models controlling for standard parameters beyond Grade Group alone, including age, PSA, stage, and margin status, have found that percent high-grade tumor volume adds minimal discrimination in prediction of biochemical risk-free survival after radical prostatectomy.32–34 One recent study additionally showed that percent high-grade tumor volume does not add net benefit to clinical management (eg, the decision to administer adjuvant, rather than “early salvage” radiation therapy).34 These latter studies, however, have not focused specifically on Grade Groups 2 to 3.
The practice patterns emerging from the GUPS Grading survey reveal 70% to 80% of respondents currently report percent Gleason pattern 4 in radical prostatectomy specimens with Grade Group 2 to 3. The impetus to report percent pattern 4 on radical prostatectomy is less compelling than on biopsy, because therapeutic decisions would seldom be affected by percent pattern 4 on the surgical specimen. This is reflected in the mixed-clinician response to the GUPS survey in which 55% (219 of 400) said it would be valuable for prognostic or treatment purposes to know the percent Gleason pattern 4 for Gleason score 7 (Grade Groups 2 and 3) at radical prostatectomy.
Optimal Method of Reporting Quantity of Gleason Pattern 4
The 2 general approaches to reporting Gleason pattern 4 used in studies to date are smaller increments (5% or 10%) versus quartile increments (eg, <25%, 25%–49%, 50%–74%, ≥75%). While a quartile approach may theoretically limit interobserver variability in both needle biopsy and radical prostatectomy specimens, it risks lumping dissimilar patients together (eg, those with 5%–10% versus near 25% Gleason pattern 4), especially in needle biopsy. In the limited studies available, interobserver variability for needle biopsy using a smaller increment approach (<5% and 10% intervals thereafter) was found to be within ±10%, with the caveats that assessment of percent Gleason pattern 4 may be more challenging when Gleason pattern 4 is scattered among pattern 3 (as opposed to clustered) and/or when less than 10% of the core is involved by cancer.35,36 No interobserver reproducibility studies focusing on percent Gleason pattern 4 assignment are currently available for radical prostatectomy specimens.
While some active surveillance cohorts with long-term follow-up have found any Gleason pattern 4 predicts for disease progression, reporting smaller increments (5% or 10% intervals) may be important for centers that include Grade Group 2 patients with low-volume Gleason pattern 4 in active surveillance protocols.37 Studies demonstrating essentially no difference in adverse pathology at radical prostatectomy between needle biopsy Grade Group 2 with small percent Gleason pattern 4 and contemporaneous Grade Group 1 patients may bolster this approach.20,22–24,38,39 In cases with truly minimal quantities of Gleason pattern 4, having to assign a small increment (eg, <5%) should prompt the pathologist to verify the presence of Gleason pattern 4 by examining multiple tissue levels and to exclude tangential sectioning.26 Using smaller increments also allows for calculation of overall percent and total millimeters of Gleason pattern 4 for the whole biopsy, which have been shown in some of the largest studies to correlate with adverse pathology at radical prostatectomy.22,24
Of the methods used to calculate the quantity of Gleason pattern 4 in needle biopsy specimens, the estimated area of Gleason pattern 4 glands divided by total cancer area was strongly preferred (175 of 189; 89%) by GUPS responders compared with using the length of Gleason pattern 4 in millimeters divided by total length of cancer. Of respondents, 161 of 217 (74%) record the quantity of Gleason pattern 4 on needle biopsy for each part separately as percent Gleason pattern 4, while 15% (33 of 217) record overall percent Gleason pattern 4 for all cores combined once per case.
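For readers who wish to see how per-part values could be combined into case-level quantities, the following Python sketch computes the total millimeters of Gleason pattern 4 and an overall percent Gleason pattern 4. It is a minimal illustration under stated assumptions: the function name overall_gp4 and the input layout are hypothetical, and each part is weighted by its cancer length in millimeters, which is only one possible approach and differs from the area-based estimate preferred by most survey respondents.

```python
def overall_gp4(parts: list[tuple[float, float]]) -> tuple[float, float]:
    """Combine per-part values into case-level Gleason pattern 4 quantities.

    parts: one (cancer length in mm, percent Gleason pattern 4) pair per part.
    Returns (total mm of Gleason pattern 4, overall percent Gleason pattern 4).
    Assumption: parts are weighted by cancer length, not by estimated area.
    """
    total_cancer_mm = sum(length for length, _ in parts)
    if total_cancer_mm <= 0:
        raise ValueError("no cancer length recorded")
    total_gp4_mm = sum(length * percent / 100 for length, percent in parts)
    return total_gp4_mm, 100 * total_gp4_mm / total_cancer_mm


# Hypothetical case: three positive parts with different amounts of pattern 4
gp4_mm, gp4_percent = overall_gp4([(4.0, 10), (6.0, 0), (2.0, 50)])
print(round(gp4_mm, 1), round(gp4_percent, 1))
```

In this hypothetical case the maximum percent Gleason pattern 4 in any one part is 50%, whereas the length-weighted overall value is far lower, which illustrates why the choice of parameter matters.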
The vast majority (187 of 197; 95%) of respondents to the GUPS pathologist survey reported their practice is to note percent Gleason pattern 4 in smaller increments (rather than quartiles) for needle biopsy specimens, with most reporting either 5% or less or 10% or less as the lowest increment, with roughly 10% increments thereafter. While a minority expressed a preference to report percent Gleason pattern 4 as a continuous variable (eg, a nonincremental value, such as 7% or 13%), given the known sampling error with needle biopsy procedures and existing interobserver variability data, caution is warranted because of risk of introducing a degree of “precision” into an inherently imprecise estimation. Future studies might address interobserver reproducibility in a more robust fashion, including the influence of emerging digital pathology/artificial intelligence solutions, as well as which parameter of Gleason pattern 4 quantitation (eg, maximum percent Gleason pattern 4 in any core, overall percent Gleason pattern 4 or millimeters of Gleason pattern 4 for the entire case) can be best translated into a relevant parameter for clinical practice.40
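The incremental reporting convention described above can be sketched in a few lines of Python. This is an illustration rather than a prescribed method: the function name report_percent_gp4 and the lowest_cutoff parameter are hypothetical, and rounding estimates at or above the cutoff to the nearest 10% is an assumption, since the handling of boundary values is not specified.

```python
def report_percent_gp4(percent: float, lowest_cutoff: int = 5) -> str:
    """Express an estimated percent Gleason pattern 4 as a reporting increment.

    Values below the lowest cutoff (5% or 10%) are reported as "<5%" or "<10%".
    Assumption: all other values are rounded to the nearest 10%.
    """
    if lowest_cutoff not in (5, 10):
        raise ValueError("lowest cutoff must be 5 or 10")
    if not 0 < percent <= 100:
        raise ValueError("percent must be greater than 0 and at most 100")
    if percent < lowest_cutoff:
        return f"<{lowest_cutoff}%"
    nearest_ten = int((percent + 5) // 10) * 10
    return f"{nearest_ten}%"


for estimate in (3, 7, 13, 65):
    print(estimate, report_percent_gp4(estimate), report_percent_gp4(estimate, 10))
```

Reporting 7% and 13% under the same 10% increment reflects the caution expressed above about implying more precision than the estimate supports.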
Reporting Quantity of Gleason Pattern 4 in Needle Biopsy Cases With Very Limited Cancer
Reporting percent Gleason pattern 4 for needle biopsy Grade Group 2 and 3 may be especially challenging in cores with limited cancer volume. For example, in a core with 1-mm cancer involvement, 70% Gleason pattern 4 will yield a Grade Group 3 diagnosis with only 0.7 mm of Gleason pattern 4, while in a core with 10-mm cancer involvement, the same amount of Gleason pattern 4 (0.7 mm) will yield a Grade Group 2 and less than 10% Gleason pattern 4. As such, assessing percent Gleason pattern 4 in small foci of carcinoma can produce an “inflated” or “deflated” Gleason score/Grade Group that may not accurately represent the cancer grade in the prostate gland. In addition, with very small foci of Gleason score 7 cancer, only a few more Gleason pattern 4 glands can radically alter the percent Gleason pattern 4. Up to 70% of the discordance between needle biopsy and radical prostatectomy Gleason score/Grade Group may be seen with small foci of tumor sampled on needle biopsy containing some degree of Gleason pattern 4.41
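The arithmetic in the example above can be reproduced with a short Python sketch. The function name gp4_summary is hypothetical, the core is assumed to contain only Gleason patterns 3 and 4, and treating more than 50% Gleason pattern 4 as Grade Group 3 is an assumption (the handling of exactly 50% is not addressed here).

```python
def gp4_summary(cancer_mm: float, gp4_mm: float) -> tuple[float, int]:
    """Return (percent Gleason pattern 4, Grade Group) for a single core.

    Assumptions: the core contains only patterns 3 and 4, and more than
    50% pattern 4 is graded as 4 + 3 = 7 (Grade Group 3).
    """
    if not 0 < gp4_mm < cancer_mm:
        raise ValueError("pattern 4 length must be between 0 and the cancer length")
    percent = 100 * gp4_mm / cancer_mm
    group = 3 if percent > 50 else 2
    return percent, group


# The same 0.7 mm of pattern 4 in a 1-mm focus and in a 10-mm focus
for cancer_mm in (1.0, 10.0):
    percent, group = gp4_summary(cancer_mm, 0.7)
    print(f"{cancer_mm} mm cancer: {percent:.0f}% pattern 4, Grade Group {group}")
```

The identical absolute quantity of Gleason pattern 4 yields 70% (Grade Group 3) in the small focus and 7% (Grade Group 2) in the larger one.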
Because of reproducibility issues in assigning percent Gleason pattern 4 in cores with less than 10% tumor volume, and because even a few glands can markedly affect percent Gleason pattern 4 and the corresponding Gleason score/Grade Group, some authors choose not to quantify Gleason pattern 4 in this scenario.20,35,36,42 Others, however, record percent Gleason pattern 4 for Grade Groups 2 to 3 regardless of the extent of cancer. This dichotomy is reflected in the practice patterns reported in the GUPS grading survey, with 58% (114 of 197) assigning and 42% (83 of 197) not assigning percent Gleason pattern 4 in needle biopsies with low-volume cancer. These scenarios with limited cancer highlight the potential peril in relying on the highest Gleason score/Grade Group in any one core, or the maximum percent Gleason pattern 4 in any one core, for management decisions, which is typical in the clinical realm. The most important recommendation in such a scenario is the necessity to communicate with clinical colleagues, with some designation in the pathology report reflecting the diagnostic difficulty. One possibility is to assign either a Gleason score 3 + 4 = 7 or 4 + 3 = 7 with a comment that the focus of cancer is too small to accurately assign a percent of Gleason pattern 4. For those pathologists already reporting cancer volume in millimeters, one avenue may be to report directly the millimeters of Gleason pattern 4. Whether to report percent Gleason pattern 4 for needle biopsy Grade Groups 2 and 3 with limited cancer volume requires more data, and there was no prevailing practice pattern on which GUPS could base a recommendation in this scenario.
A recent study demonstrated similar rates of adverse findings on radical prostatectomy between needle biopsy Grade Group 2 with limited length Gleason pattern 4 and contemporaneous needle biopsy Grade Group 1 cases and that total (cumulative) length of Gleason pattern 4 in all cores was most predictive of adverse pathology.24
Reporting Quantity of Gleason Pattern 4 in Needle Biopsy Cores in Cases With at Least One Part Showing Gleason Score 4 + 4 = 8 (Grade Group 4)
Although reporting of percent Gleason pattern 4 is recommended for Gleason score 7 (Grade Groups 2–3), it is unclear if there is additional value in quantifying extent of Gleason pattern 4 for parts with Gleason scores 3 + 4 = 7 and 4 + 3 = 7 in cases with other parts showing Gleason score 4 + 4 = 8. No study on this issue exists to date. One rationale for reporting extent of Gleason pattern 4 in these high-risk patients is to attempt to more accurately predict upgrading and downgrading from the biopsy compared with the entire prostate. Approximately half of patients with highest grade Gleason score 4 + 4 = 8 on biopsy are downgraded to a lower Gleason score at radical prostatectomy.43–45 There may be an opportunity to improve risk stratification and treatment planning in patients selecting radiation therapy. The current NCCN guideline for the addition of intense (18–36 months) versus less intense (4 months) ADT hinges on whether the tumor is Grade Group 4 versus 3.27 It is possible that patients with highest Grade Group 4, yet a low amount of overall Gleason pattern 4 disease, as reflected in overall percent Gleason pattern 4 for the case, may represent a subset better suited for less aggressive therapy, similar to treatment of Gleason score 4 + 3 = 7 prostate cancer.46–48
Given the potentially compelling nature of the above rationales, further studies are needed to examine the overall percent Gleason pattern 4 in cases with highest needle biopsy Grade Group 4. On a practical level, if a pathologist routinely assigns percent Gleason pattern 4 for parts containing Grade Group 2 to 3 cancer, there may be value in not changing this practice for patients with highest needle biopsy Gleason score 4 + 4 = 8, both to ensure consistency in practice and to enable clinicians to use the additional information in patient counseling/management. In the GUPS survey, in needle biopsies with highest Grade Group 4 (Gleason score 4 + 4 = 8) prostate cancer, 73% (143 of 196) of pathologists record the quantity of pattern 4 on other cores of lower grade. A majority of clinicians (215 of 308; 70%) in the GUPS survey said it would be valuable for patient counseling or management, in the setting of 1 core with 10% Grade Group 4 (Gleason score 4 + 4 = 8) and 3 other cores showing Grade Group 3 (Gleason score 4 + 3 = 7), to know whether the 3 cores had 60% versus 90% pattern 4.
Reporting Quantity of Gleason Pattern 4 in Needle Biopsy Parts in Cases With at Least One Part Showing Grade Group 5 (Gleason Score 4 + 5 = 9, 5 + 4 = 9, 5 + 5 = 10)
Compared with scenarios with highest needle biopsy Grade Groups 3 to 4, a more substantial debate exists regarding the need to quantify Gleason pattern 4 in highest needle biopsy Grade Group 5 cases containing parts with lower Grade Groups. In the development and subsequent validation of the Grade Group system, no clear prognostic differences were noted among cancers with Gleason score 4 + 5 = 9, 5 + 4 = 9, and 5 + 5 = 10 and, hence, these were all integrated into Grade Group 5.6,9 By NCCN criteria, all patients with needle biopsy Grade Group 5 are labeled high risk, and therefore additional quantification of Gleason pattern 4 (from other parts) is unlikely to change management.27 Rather than reporting percent pattern 4 in cases with at least 1 biopsy part showing Grade Group 5 cancer, there may be future consideration of reporting overall percent Gleason pattern 4 or 5 cancer (currently not recommended).
GUPS survey respondents were split on this issue with 50% (99 of 197) saying that if 1 or more cores show Grade Group 5 (GS 9–10) cancer, they still recorded the quantity of pattern 4 for other cores of lower grade.
WORKING GROUP 2: TERTIARY GRADE PATTERNS
A summary of recommendations on tertiary grade patterns is seen in Table 4.
Definition
A significant number of prostate cancers are composed of more than 2 Gleason patterns, and not infrequently the highest pattern is present in the smallest amount, which many urologic pathologists report as a “tertiary Gleason pattern.” In most studies, the tertiary grade pattern in radical prostatectomy specimens refers to the third most common, highest grade pattern (ie, Gleason pattern 5) comprising 5% or less of a dominant tumor nodule.30,49–58
By convention, a third most common pattern is not used when grading cancer in needle biopsy specimens, based on a grading rule introduced at the ISUP Consensus Conference in 2005.4 It has been shown that needle biopsy cases with tertiary Gleason pattern 5 show similar pathologic findings and clinical outcomes as cases with secondary Gleason pattern 5.59 When a tertiary Gleason pattern 5 is found on biopsy, it should be combined with the primary (most common) pattern to derive the overall Gleason score. For example, when Gleason scores 3 + 4 and 4 + 3 are present on biopsy with a lesser amount of Gleason pattern 5, the Gleason scores are 3 + 5 = 8 and 4 + 5 = 9, respectively. In the GUPS pathologist survey, if there were 3 grades on needle biopsy, 94% (192 of 204) reported the Gleason score as combining the most common and highest-grade patterns. Of these respondents, a minority (46 of 204; 22.5%) would also mention the second most common Gleason pattern that was not included in the score. A similar approach is advocated for transurethral resections, although no data exist evaluating this rare scenario.
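The needle biopsy rule described above, in which the most common pattern is combined with the highest-grade pattern, can be sketched in Python as follows. The function name biopsy_gleason_score and the dictionary input are hypothetical. Two aspects are assumptions of this sketch rather than statements from the text: ties between pattern percentages are not resolved, and when the most common pattern is already the highest grade, the second most common pattern is used as the secondary pattern.

```python
def biopsy_gleason_score(pattern_percents: dict[int, float]) -> tuple[int, int]:
    """Return (primary, secondary) patterns for a needle biopsy part.

    pattern_percents maps each Gleason pattern (3, 4, or 5) to its percent
    of the cancer. The primary pattern is the most common one; the secondary
    is the highest-grade pattern present, however small its amount.
    """
    present = {p: pct for p, pct in pattern_percents.items() if pct > 0}
    if not present:
        raise ValueError("no Gleason pattern present")
    ranked = sorted(present, key=lambda p: present[p], reverse=True)
    primary = ranked[0]
    if len(ranked) == 1:
        return primary, primary
    highest = max(present)
    if highest != primary:
        return primary, highest
    # Assumption: the primary pattern is already the highest grade, so the
    # second most common pattern is used as the secondary pattern.
    return primary, ranked[1]


print(biopsy_gleason_score({3: 60, 4: 35, 5: 5}))
print(biopsy_gleason_score({4: 60, 3: 30, 5: 10}))
```

The two calls correspond to the examples in the text, yielding Gleason scores 3 + 5 = 8 and 4 + 5 = 9, respectively.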
There are significant variations in the literature in how tertiary grade pattern is applied in radical prostatectomy specimens. A few investigators have used a cutoff of 10% Gleason pattern 5 rather than 5%.60,61 However, others have not used a specific cutoff and have labeled the third most common pattern regardless of its extent as tertiary grade pattern 5.62–66 In some older studies, tertiary-grade pattern was also used to denote a minimal (≤5%) higher-grade pattern in cases showing overwhelmingly 1 lower grade.51,52,54–56,62,63,67 The most common scenario was that of a predominant Gleason pattern 3 tumor with 5% or less Gleason pattern 4, where authors used the term “Gleason score 3 + 3 = 6 with tertiary pattern 4” to denote cases that had a better outcome than in patients with Gleason score 3 + 4 = 7 (defined in these studies as having >5% Gleason pattern 4).51,62,63 Last, in some studies, the term “tertiary grade” was defined as any higher grade representing 5% or less of the cancer, regardless of whether those cases comprised 2 or 3 grades. In such studies, cases of Gleason score 3 + 3 = 6 (Grade Group 1) with 5% or less Gleason pattern 4, and Gleason scores 3 + 4 or 4 + 3 = 7 (Grade Groups 2 or 3) with 5% or less Gleason pattern 5 were all considered together as “tertiary grade pattern.”51,52,54–56,62 In some recent studies, a quantitative approach to account for increasing volume of Gleason pattern 5 has been proposed, although the specific methodology to accurately and reproducibly measure the amount of pattern 5 (as well as other patterns) in a simple manner remains an ongoing challenge (see also Working Group 8). In cases with predominantly pattern 4 and less than 5% Gleason pattern 5, there is controversy as to how grade should be assigned, with some experts advocating grading as Gleason score 4 + 4 = 8 with minor high-grade pattern 5 and others grading as Gleason score 4 + 5 = 9. 
The studies addressing “tertiary”/minor high-grade patterns have not compellingly addressed this scenario. In the absence of convincing data, in these uncommon cases it is the GUPS consensus to report the grade as Gleason score 4 + 5 = 9, with an option of noting the percentage of Gleason pattern 5 is minor (ie, <5%).
In the GUPS pathologist survey, 78% (165 of 212) of respondents used 5% as the cutoff for tertiary Gleason pattern 5 on radical prostatectomy. This also represents the most validated cutoff point in the literature, supported by evidence-based outcome data.30,49–58 However, clinicians (n = 400) in the GUPS survey were equally split on whether the tertiary component had to be 5% or less or could be any amount as long as it was the third most common pattern.
The term “tertiary grade pattern” itself is a source of some confusion. In the hypothetical example of a radical prostatectomy dominant nodule with 50% Gleason pattern 4, 30% Gleason pattern 3, and 20% Gleason pattern 5, Gleason pattern 5 is the third most common pattern and would be the “tertiary” component if one strictly applied the term. However, based on the GUPS pathologist survey and most of the literature, the purpose of denoting a “tertiary” component is to identify cases where Gleason pattern 5 is present in a limited amount. Therefore, GUPS recommends replacing “tertiary grade pattern” with the term “minor tertiary pattern 5,” to be used only in cases with Grade Groups 2 or 3 (Gleason score 3 + 4 = 7 or 4 + 3 = 7) with 5% or less Gleason pattern 5 in a radical prostatectomy. In a dominant tumor nodule with Gleason patterns 3, 4, and 5, and where Gleason pattern 5 is the least amount but exceeds 5% of the cancer, it should be considered as the secondary grade and it should be incorporated into the final Gleason score.
Prognostic Significance of Tertiary Grade Pattern
Determining the prognostic impact of a tertiary grade pattern is limited by the different definitions used for tertiary grade pattern (see above), as well as the different clinical or pathologic endpoints used in the various studies. Most studies clearly show that the presence of a tertiary grade pattern is associated with a worse prognosis in comparison to cancers within the same Grade Group without a tertiary grade pattern.30,49–58,60–67 However, the degree to which tertiary pattern 5 affects prognosis is still a matter of debate, with some studies showing that cases with tertiary pattern 5 behave similarly to those in the next higher Grade Group (ie, Grade Group 2 with minor tertiary pattern 5 behaves similarly to Grade Group 3), while others show a prognosis intermediate between grades (ie, Grade Group 2 with minor tertiary pattern 5 behaves intermediate between Grade Groups 2 and 3).49–51,57,64,65 Despite the above data, clinicians (n = 400) responding to the GUPS survey were approximately evenly divided on whether, in radical prostatectomy specimens with Gleason score 7 (Grade Groups 2 and 3), a minor (≤5%) Gleason pattern 5 would affect decisions regarding further therapy.
The prognosis of minor tertiary pattern 5 still warrants future adjustments based on large outcome-based studies using the percentages of different Gleason patterns. Such studies will be informative not only in identifying whether there is a most prognostic cutoff point for the highest-grade pattern, but also how to best incorporate this information into a final report that provides the most precise prognostic information to treating physicians and patients.
Reporting Minor Tertiary Pattern 5 With Grade Groups
As there is no consensus on whether a minor tertiary pattern 5 worsens the prognosis to that of the next higher Grade Group, GUPS currently recommends that the Grade Group not be changed in this setting. Minor tertiary pattern 5 should be noted along with the Gleason score, with the Grade Group based on the Gleason score (ie, Gleason score 3 + 4 = 7 [Grade Group 2] with minor tertiary pattern 5; Gleason score 4 + 3 = 7 [Grade Group 3] with minor tertiary pattern 5).
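The reporting rule for minor tertiary pattern 5 can be illustrated with a minimal sketch. The function name `grade_rp_nodule`, its percentage arguments, and the decision to reject inputs outside the scenario described above are assumptions made for illustration; the sketch covers only a radical prostatectomy dominant nodule in which Gleason pattern 5 is the least prevalent of patterns 3, 4, and 5.

```python
def grade_rp_nodule(pct3: float, pct4: float, pct5: float, cutoff: float = 5.0) -> str:
    """Report line for a radical prostatectomy dominant nodule (illustrative only).

    pct3, pct4, pct5 are the percentages of Gleason patterns 3, 4, and 5.
    cutoff is the 5% threshold for a "minor" tertiary pattern 5.
    """
    # Hypothetical guard: the sketch handles only the scenario in the text,
    # where pattern 5 is present but is the least prevalent pattern.
    if min(pct3, pct4) <= 0 or pct5 <= 0 or pct5 >= min(pct3, pct4) or pct3 == pct4:
        raise ValueError("sketch requires patterns 3, 4, 5 with pattern 5 least prevalent")

    primary = 4 if pct4 > pct3 else 3

    if pct5 <= cutoff:
        # Minor tertiary pattern 5: the Grade Group stays tied to the Gleason score.
        secondary = 3 if primary == 4 else 4
        group = 2 if primary == 3 else 3
        return (f"Gleason score {primary} + {secondary} = 7 "
                f"[Grade Group {group}] with minor tertiary pattern 5")

    # Pattern 5 exceeds the cutoff: it becomes the secondary grade.
    return f"Gleason score {primary} + 5 = {primary + 5}"


print(grade_rp_nodule(pct3=30, pct4=50, pct5=20))
print(grade_rp_nodule(pct3=60, pct4=36, pct5=4))
```

The first call reproduces the hypothetical nodule with 50% pattern 4, 30% pattern 3, and 20% pattern 5, where pattern 5 is incorporated as the secondary grade; the second shows a Grade Group 2 case in which pattern 5 is only noted.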
WORKING GROUP 3: CASE-LEVEL BIOPSY SCORE: GLOBAL VERSUS HIGHEST
A summary of recommendations on case-level biopsy score: global versus highest is seen in Table 5.
Definitions of “Case-level,” “Global,” and “Highest” Gleason Score
A “case-level Gleason score” is the single, overall Gleason score presumed to be the overall grade for the entire case used for treatment and prognostic purposes. Although there is some variation by geography (see below), in most countries a case-level Gleason score is not assigned by the pathologist at the beginning or end of the prostate biopsy pathology report. For pathologists who assign a case-level Gleason score, there are 2 general approaches. One can report the part (jar) with the highest Gleason score. Alternatively, one could take into account the grades from the different jars, using different methods to derive a case-level score, which we will refer to in this manuscript as a “global Gleason score.” This grading approach is most relevant for cases with multiple positive cores from different sites showing different Gleason scores. Assigning a case-level Gleason score for the entire biopsy series would obviate the need for clinician interpretation in determining which Gleason score to use for patient management or prognostication. This question was discussed at the ISUP 2005 Consensus Conference, but no agreement was reached.4 It was recommended that a separate score be reported for each positive core or for a container with multiple positive cores that are sampled from a specific site and allow clinicians to interpret them as per their practice for patient management. Providing a case-level score was deemed optional, but it was not specified how such a case level score should be generated.
Clinicians' preference to use the ‘highest score' as the case level score has been demonstrated in surveys conducted among urologists in the United States, France and Belgium, and the United Kingdom.68–70 This was reflected in the current GUPS clinician survey in which 72% (266 of 371) report using the part (jar) with the highest Gleason score. This practice likely reflects a concern of not undergrading the case when planning patient management, as the grade present in prostate biopsies may be lower than the Gleason score at radical prostatectomy.4 Given frequent tumor heterogeneity, the highest Gleason score in any core or biopsy part may also suggest a potentially more aggressive tumor. There is, however, the risk of grade inflation with selecting the highest Gleason score in any one part.71
Correlation With Outcome of Highest Versus Global Biopsy Score
It has been debated whether the highest Gleason score/Grade Group found in a part versus using a global Gleason score correlates better with the final radical prostatectomy Gleason score and/or is superior in predicting outcome.10,28,72–77 Some studies favoring the use of the highest biopsy Gleason score/Grade Group were based on smaller and limited series that predated the 2005 ISUP Consensus and were based on sextant biopsies, rather than extended biopsies (≥10–14 cores), which are routine in current practice.74,75 Several recent studies have re-examined the practice of using the highest Gleason score as the case-level Gleason score and showed that the global Gleason score is as good a predictor of outcome as the highest score, or even slightly superior, and correlates better with the final score on radical prostatectomy.10,28,72,76,77
Current Practice of Case-Level Biopsy Gleason Score Reporting
There are clear geographic differences in how biopsy Gleason scores are assigned. The prevailing practice in the United States is for clinicians to select the highest Gleason score per part as the case-level Gleason score. The various predictive tools (ie, Partin tables and nomograms) that have been validated and proven to be prognostically useful, have used the highest core score in cases where there are multiple cores of different grades.78,79 In contrast, the practice of reporting a global Gleason score is used routinely in some other parts of the world. In 1 study of European genitourinary pathologists, 68% recorded a case level Gleason score, of which 77% used a global Gleason score compared with 17% using the highest Gleason score as the case level Gleason score.80 Global Gleason score is also routinely reported in Sweden and it is widely used in many countries, including Canada and South Korea.7,81–84 The recent 2016 guidelines on screening, diagnosis, and local treatment with curative intent of clinically localized prostate cancer of the European Association of Urology–European Society for Radiotherapy & Oncology–International Society of Geriatric Oncology explicitly stated “ISUP 2014 grade should be given as a global grade, taking into account the Gleason grades of cancer foci in all biopsy sites.”85
The results of the current pathologist survey recapitulate the previously observed geographic differences regarding highest versus global Gleason score assignment. Given the majority of the GUPS survey participants (126 of 230; 55%) were from the United States, it is not surprising a minority of all the survey participants separately reported the highest Gleason score (56 of 211; 26%) or the global Gleason score (49 of 211; 23%), as a case-level biopsy score. While 40% (42 of 104) of the non-United States based GUPS pathologists (mostly from Canada, Europe, and Australia) separately reported a global Gleason score, only a small minority (7 of 109; 6%) of United States participants did so. Similarly, only 16% (17 of 109) of GUPS pathologists from the United States reported a separate highest Gleason score as the case-level Gleason score.
When there is grade heterogeneity in the positive cores from different sites, the principle of adding “the most common and the highest grade,” as a “case-level” global biopsy Gleason score has been used in several large studies from Europe, Australia and the Netherlands, and Canada.77,86,87 This approach follows the ISUP 2005 rule of grading each individual core or separately submitted specimen, by using the sum of the most prevalent and the highest grade in any individual core or separate biopsy part/specimen, yet it applies as if all the positive cores were 1 long positive core. For example, if Gleason pattern 3 is the most common Gleason pattern in the case, 4 is the second most common Gleason pattern, and 5 is the least common Gleason pattern, using the above approach the global Gleason score would be 3 + 5 = 8.
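The “most common plus highest” approach can be sketched as follows. This is an illustration rather than a validated method: the function name `global_gleason_score` and the input format (total tumor amount per Gleason pattern, pooled over all positive cores) are assumptions. In particular, the fallback used when the most common pattern is also the highest pattern (taking the second most common pattern as secondary, with no minimum-percentage threshold) is a simplification that is not specified in the text.

```python
from typing import Dict, Tuple


def global_gleason_score(pattern_amounts: Dict[int, float]) -> Tuple[int, int]:
    """Global score treating all positive cores as one long core.

    pattern_amounts maps a Gleason pattern (3, 4, or 5) to the total amount
    of that pattern in the case (for example, millimeters of tumor).
    """
    present = {p: a for p, a in pattern_amounts.items() if a > 0}
    if not present:
        raise ValueError("no carcinoma present")

    # Rank patterns from most to least prevalent.
    ranked = sorted(present, key=present.get, reverse=True)
    most_common = ranked[0]
    highest = max(present)

    if highest != most_common:
        return most_common, highest

    # Assumption: if the most common pattern is also the highest one,
    # use the second most common pattern (if any) as the secondary grade.
    secondary = ranked[1] if len(ranked) > 1 else most_common
    return most_common, secondary


primary, secondary = global_gleason_score({3: 6.0, 4: 3.0, 5: 1.0})
print(f"{primary} + {secondary} = {primary + secondary}")
```

With pattern 3 most common, pattern 4 second, and pattern 5 least common, the sketch returns 3 + 5 = 8, matching the example above.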
In the current pathologist survey, among the GUPS pathologists who reported a separate global Gleason score on biopsy, the most common methods were (1) “add the most common pattern in the case and the highest pattern in any part” (22 of 49; 45%); (2) “average the grades of all parts together, combining all positive cores as if it was 1 core” (18 of 49; 36%); and (3) “average only the grades from certain parts together based on the location of the tumor” (6 of 49; 12%). The remaining pathologists (3 of 49; 7.1%) used miscellaneous methods of determining the global Gleason score. These results indicate not only a geographic diversity in the use of global Gleason scores, but also a lack of consensus among those who use a global Gleason score as to the optimal method of deriving it.
There are arguments against providing a case-level score based on either a global or highest Gleason score. For example, frequent multifocality of prostate cancer would be an argument against using a global Gleason score. If the cancer found on prostate biopsy reflected sampling of a single tumor nodule, then it would be reasonable to assign a global Gleason score as if all the positive cores represented 1 long positive core. However, in the setting of multifocal cancer of different grades, assigning a global Gleason score is potentially problematic. For example, in a case with a nodule of Gleason score 4 + 4 = 8 on the left side of the prostate (1 core positive) and a separate nodule of Gleason score 3 + 3 = 6 on the right side of the prostate (2 cores positive), the global Gleason score on biopsy would be 3 + 4 = 7 according to the 2 most common methods of calculating global scores, which may underrepresent the biologically relevant cancer in the gland and result in variation in treatment recommendations and prognostication. If one used the less common definition of global scores, averaging only the grades from certain parts together based on the location of the tumor, an accurate grade would have been assigned in the above hypothetical case. However, in many cases when using standard biopsy techniques, it is not evident whether separate nodules are sampled, or which positive core corresponds to which nodule. This introduces another level of subjectivity and interobserver variability beyond that associated with grading itself.
Assigning the highest Gleason score as the case-level Gleason score also is problematic in certain cases. Assume a single nodule of Gleason score 4 + 3 = 7 where most of the cores have the same score. However, there is one core that sampled only a small focus of the Gleason pattern 4 component, resulting in a core with Gleason score of 4 + 4 = 8. In this hypothetical case, the case-level score based on the highest grade would be inaccurate.
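The two hypothetical cases above can be worked through in a short sketch that computes both candidate case-level scores. Everything here is illustrative: the names `score_from_amounts` and `case_level_scores`, the representation of each part (jar) as tumor amount per Gleason pattern, the assumed core lengths, and the ranking of scores by sum and then primary pattern are assumptions, as is the fallback to the second most common pattern when the most common pattern is also the highest.

```python
from collections import Counter


def score_from_amounts(amounts):
    """Most common pattern + highest pattern, as if one long core.

    Assumption: if the most common pattern is also the highest, the second
    most common pattern (if any) is used as the secondary grade.
    """
    present = {p: a for p, a in amounts.items() if a > 0}
    ranked = sorted(present, key=present.get, reverse=True)
    primary, highest = ranked[0], max(present)
    if highest != primary:
        return primary, highest
    return primary, (ranked[1] if len(ranked) > 1 else primary)


def case_level_scores(parts):
    """Return (highest score in any part, global score over all parts)."""
    per_part = [score_from_amounts(p) for p in parts]
    # Assumption: rank scores by sum, then by primary pattern.
    highest = max(per_part, key=lambda s: (sum(s), s[0]))
    pooled = Counter()
    for p in parts:
        pooled.update(p)  # adds the amounts pattern by pattern
    return highest, score_from_amounts(pooled)


# Multifocal case: 1 core of 4 + 4 = 8 on the left, 2 cores of 3 + 3 = 6 on the right.
multifocal = [{4: 1.0}, {3: 1.0}, {3: 1.0}]
# Single nodule of 4 + 3 = 7, with one core that sampled only pattern 4.
single_nodule = [{4: 3.0, 3: 2.0}, {4: 3.0, 3: 2.0}, {4: 0.5}]

print(case_level_scores(multifocal))     # highest (4, 4), global (3, 4)
print(case_level_scores(single_nodule))  # highest (4, 4), global (4, 3)
```

In the multifocal case the global score of 3 + 4 = 7 understates the left-sided nodule, whereas in the single-nodule case the highest score of 4 + 4 = 8 overstates a tumor that is 4 + 3 = 7 overall, which is the tension described above.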
In summary, based on the review of the literature and the survey results, the practice of reporting global and/or highest Gleason score as a “case-level” biopsy score remains a reporting option, with clear differences based on geography. More contemporary studies are needed to resolve this question factoring in imaging results, especially how and whether magnetic resonance imaging (MRI)-targeted biopsies of the prostate can improve the accuracy of the grade assigned to a case.
MRI-Targeted Biopsies of the Prostate
MRI-targeted biopsy of the prostate is superior to systematic prostate biopsy for identifying high-grade prostate cancer, although in a substantial minority of cases the targeted biopsy misses the highest grade cancer that is detected by the systematic biopsies.88,89 Truong et al,90,91 however, found the visibility of cribriform tumors was lower than that of other architectural patterns across all tumor sizes, and only 17.4% of cribriform tumors in pure form were visible on multiparametric MRI. As MRI-targeted biopsy is becoming more widely used in routine practice, it is necessary for pathologists to adopt standardized reporting in this setting. Unlike the standard prostate biopsy that uses a systematic but blind approach to tissue sampling, MRI-targeted biopsy acquires multiple cores of tissue from 1 or more lesions suspicious for prostate cancer by imaging. Urologic and radiologic societies have created guidelines for the analysis, reporting, and utilization of prostate MRI and MRI-targeted biopsies.92–97 An international consensus on the standards of reporting for MRI-targeted prostate biopsy studies recommended separate reporting for the standard systematic and targeted cores, specifically regarding the Gleason score and the cancer extent.94
Sampling of Targeted Lesions and Grade Heterogeneity
Numerous studies have evaluated the relevance of the number of cores taken during MRI-targeted biopsy in regard to prostate cancer detection and tumor heterogeneity. The consensus of most studies is that sampling more than 1 core provides a significant cancer detection benefit.98–104 Some studies have recommended sampling up to 5 cores per MRI suspicious lesion.102 In addition, it was shown that sampling multiple cores potentially enhances detection of tumor heterogeneity.99,101,103 A consensus statement by the American Urological Association and Society of Abdominal Radiology recommended that at least 2 cores are obtained from each MRI-suspicious lesion.100 In order to prevent tissue fragmentation and to ensure the quality of histologic assessment, it was also recommended no more than 2 cores are placed in a single container.105–108
Tumor Grading and Global Score in MRI-Targeted Biopsies
As discussed above, clinical surveys have shown the majority of clinicians used the highest Gleason score and tumor extent in standard systematic prostate biopsies to guide their treatment plan.68,69 When multiple cores are taken from a single MRI-targeted lesion, assessing each core from that location individually, as opposed to evaluating all the targeted cores from that location in aggregate, could result in differences in reporting cancer grade and extent.70,96,109 Gordetsky et al109 compared an aggregate reporting method versus an individual core reporting method when multiple cores were taken from a single MRI-targeted lesion. The study found the aggregate approach better correlated with tumor volume and extraprostatic extension but was inconclusive regarding which method better correlated with the final grade on radical prostatectomy. The paucity of available data is reflected in the GUPS pathologist survey, which showed an equal split between the approaches of grading separately each individual positive core sampled from an MRI-targeted area (103 of 211; 49%) versus averaging the grade of all positive cores from a given target (108 of 211; 51%). However, 76% (281 of 371) of clinicians in the GUPS survey were of the opinion that a grade should be assigned to each individual positive core in MRI-targeted biopsies showing multiple positive cores with different grades. In MRI-targeted biopsies showing positive cores in both the targeted area(s) and in systematic biopsies, when different scores are found, clinicians (n = 371) were equally divided as to whether (1) each part (jar) should be graded separately and a single global score given factoring both standard and targeted biopsy results; (2) each part (jar) should be graded separately and global scores given separately for systematic and targeted biopsies; or (3) each part (jar) should be graded separately without assigning a global score.
For systematic biopsies that have multiple undesignated cores with cancer in 1 container, the current standard of practice is to provide an overall grade for the cores in the specimen container. Following this practice, and until more studies are available to clarify the issue, we recommend the same approach for grading when multiple cores are taken from a single MRI-targeted lesion.
The question of how and whether to report a global score in cases when positive biopsies with different scores are found both in the standard and the MRI-targeted biopsy cores has not been previously addressed. Of the GUPS survey respondents who report global scores, 61% (30 of 49) of pathologists would grade each part (jar) separately and give a single global Gleason score factoring both the systematic and the targeted positive biopsies, while 26.5% (13 of 49) would grade each part separately and give separate global scores for the systematic and targeted biopsies.
For those who report a case-level Gleason score, increased use of MRI-targeted biopsies could also have implications on whether the highest Gleason score per part or a global Gleason score is more accurate as the case-level grade. MRI-targeted biopsies tend to better sample the highest-grade tumor in the gland, and likely the highest Gleason score per part in these targeted biopsies will more accurately reflect the overall grade of the tumor in the prostate. Alternatively, a global score based on the specific location of positive cores correlated with the visualized tumor nodule(s) on imaging may prove a superior case level score.
WORKING GROUP 4: UPDATE ON GRADE GROUPS
A summary of recommendations on update on Grade Groups is seen in Table 6.
Nomenclature of Grade Groups
In 2013, the group at the Johns Hopkins Hospital proposed a new patient-centric grade grouping system, which more accurately reflects outcomes and prognosis compared with the Gleason score.9 This system was subsequently validated in a larger multi-institutional cohort and then in many additional studies correlating Grade Groups with biochemical recurrence, distant metastases, and death following biopsy, radical prostatectomy, and radiation therapy.6,110 “Grade Group” nomenclature has been accepted by the American Joint Committee on Cancer, CAP, ISUP, WHO, and other international associations and organizations.5,6,18,111–118 In the current survey, 95% (200 of 211) of GUPS members reported both Gleason scores and Grade Groups, and only 5% (11 of 211) reported Gleason scores only. Among clinicians who responded to the GUPS survey, 65.5% (243 of 371) report discussing prostate cancer with their patients using Grade Groups and Gleason score together, while 27.5% (102 of 371) use only Gleason score and 7% (26 of 371) use only Grade Groups. The American Joint Committee on Cancer, WHO, and CAP all use “Grade Groups” as the accepted terminology. While some have used alternative nomenclature, such as “Gleason Grade Groups,” “ISUP Grade Groups,” or “WHO Grade Groups,” none of these accurately reflects the origin of the Grade Group concept or the validation studies performed. Rather, groups such as ISUP and the WHO Panel on Genitourinary Tumors have been influential in endorsing Grade Groups and helping to establish their usage. As such, GUPS recommends the continued use of the simple terminology “Grade Groups” as a unifying language for reporting.
Utilization Patterns of Grade Groups and Equivalent Gleason Scoring Grouping
Grade Groups were in part developed because of anecdotal evidence that Gleason scores were incorrectly combined in the literature. To examine how published studies categorized Gleason Scores in current practice, a study was conducted on 1576 articles published in 2016 to 2017.119 One issue was whether Gleason scores 3 + 4 = 7 and 4 + 3 = 7, despite different prognoses, were combined together and considered as only Gleason score 7. In that review, pathology journals (57%) were more likely than non-pathology journals (40%) to separate Gleason score 7 into 3 + 4 = 7 and 4 + 3 = 7. Articles co-authored by a pathologist separated Gleason score 7 (53%) more frequently than those without a pathologist (33%) and clinical articles (44%) separated Gleason 7 more than research articles (33%).
A second issue was whether Gleason scores 8 and 9 to 10, despite very different prognoses, were combined together and considered as high-grade cancer (ie, Gleason scores 8–10). In the same review, pathology journals (55%) separated Gleason score 8 from Gleason scores 9 to 10 more than non-pathology journals (34%), while articles co-authored by a pathologist separated Gleason score 8 (45%) from Gleason scores 9 to 10 more often than those without a pathologist (29.5%).119
Gleason scores in the literature in 2016 were grouped into the 5 Grade Group equivalents in only 11.8%, which rose to 34.4% in 2017 after publication of the Grade Group system. However, approximately one third of scientific articles in 2016 and 2017 still grouped Gleason score according to the inaccurate NCCN/D'Amico classification of 6 or less, 7, and 8 to 10.119
Gleason Scores 3 + 5 and 5 + 3: Which Grade Group Do They Belong To?
One of the major practice pattern recommendations that emerged from the ISUP 2005 consensus conference on Gleason grading of prostatic adenocarcinoma was the discontinuation of tertiary Gleason grades/patterns on prostate needle core biopsies when 3 different Gleason grades/patterns are present.4 The current recommendation is that the highest Gleason grade/pattern on biopsy, even when it is the least prevalent “tertiary pattern,” should be included as the secondary grade pattern in the final Gleason score.4,64,120,121 However, there are questions as to what is the most appropriate Grade Group to assign to the uncommon cases that have a Gleason score of 3 + 5 = 8 or 5 + 3 = 8 (Figure 1). Currently, Grade Group 4 includes Gleason scores 4 + 4 = 8 (most common scenario), 3 + 5 = 8, and 5 + 3 = 8 (least common scenario).
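For reference, the current assignment of Gleason scores to Grade Groups can be written as a small lookup. This is a minimal sketch: the function name `grade_group` is hypothetical, and restricting input to Gleason patterns 3 through 5 is a simplifying assumption. The mapping places Gleason score 3 + 3 = 6 in Grade Group 1, 3 + 4 = 7 in Grade Group 2, 4 + 3 = 7 in Grade Group 3, all Gleason score 8 combinations in Grade Group 4, and Gleason scores 9 to 10 in Grade Group 5.

```python
def grade_group(primary: int, secondary: int) -> int:
    """Map a Gleason score (primary + secondary pattern) to its Grade Group."""
    # Assumption: only Gleason patterns 3, 4, and 5 are accepted.
    if primary not in (3, 4, 5) or secondary not in (3, 4, 5):
        raise ValueError("patterns must be 3, 4, or 5")

    total = primary + secondary
    if total == 6:
        return 1
    if total == 7:
        # 3 + 4 and 4 + 3 have different prognoses and are kept separate.
        return 2 if primary == 3 else 3
    if total == 8:
        # 4 + 4, 3 + 5, and 5 + 3 are all currently Grade Group 4.
        return 4
    return 5  # 4 + 5, 5 + 4, and 5 + 5


for score in [(3, 3), (3, 4), (4, 3), (4, 4), (3, 5), (5, 3), (4, 5), (5, 4), (5, 5)]:
    print(f"{score[0]} + {score[1]} = {sum(score)} -> Grade Group {grade_group(*score)}")
```

Keeping the primary and secondary patterns as separate arguments, rather than passing only the sum, preserves the distinction between 3 + 4 and 4 + 3 and leaves room to treat 5 + 3 = 8 differently if future data support it.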
It is important to put into perspective the low frequency of Gleason score 3 + 5 = 8, with Gleason score 5 + 3 = 8 being even rarer. In a large multi-institutional study, including genitourinary pathology experts, of 20 845 radical prostatectomy specimens from 2005 to 2014, there were only 39 (0.2%) cases with Gleason scores 3 + 5 = 8 and only 4 (0.02%) with Gleason score 5 + 3 = 8.6 Similarly, of 16 172 needle biopsy cases, there were only 44 (0.3%) with Gleason score 3 + 5 = 8 and only 6 (0.04%) with Gleason score 5 + 3 = 8. In a more recent review, of 14 359 biopsies with cancer and 6727 radical prostatectomy specimens, only 1 case (0.007%) and no cases (0%) were graded as Gleason score 5 + 3 = 8, respectively, after re-review.16 A few studies have demonstrated that Gleason score 3 + 5 = 8 has a similar prognosis to Gleason score 4 + 4 = 8.57,86,122,123 One such study of 428 prostate needle core biopsy cases (only 5 patients had Gleason score 5 + 3 = 8 and were excluded from the study) demonstrated similar prostate cancer related survival in patients with Gleason score 3 + 5 = 8 and Gleason score 4 + 4 = 8.122 In addition, the metastatic rate in patients with Gleason score 3 + 5 = 8 was similar to patients with Gleason score 4 + 4 = 8. The only difference was a marginally statistically significant higher nonmetastatic rate (after radiation/hormone therapy) among Gleason score 3 + 5 = 8 cases compared with Gleason score 4 + 4 = 8 cases. Similarly, Lu et al123 found no statistically significant difference in the prostate cancer specific mortality between Gleason scores 4 + 4 = 8 and 3 + 5/5 + 3 = 8. They also found Gleason score 9 had a significantly higher all-cause mortality and prostate cancer specific mortality than Gleason 3 + 5/5 + 3 = 8.
In contrast, van den Bergh et al86 found patients with biopsy and radical prostatectomy Gleason scores 3 + 5 = 8 had a more favorable prognosis than Gleason score 4 + 4 = 8 cancers, and more comparable to those with Gleason score 4 + 3 = 7.
Mahal et al124 analyzed Surveillance, Epidemiology, and End Results Program (SEER) data for patients with Gleason score 3 + 5 = 8 and Gleason score 4 + 4 = 8 disease and found similar disease-specific death rates.124 In contrast, this same analysis demonstrated that Gleason score 5 + 3 = 8 cancers had a 2-fold increase in disease-specific death rates when compared with Gleason score 4 + 4 = 8 cases.124 A major weakness associated with SEER data is that the consistency and accuracy of Gleason grading cannot be guaranteed, because the vast majority of the cases were not graded by urologic pathologists. Other authors have suggested subcategorizing Gleason score 8 prostatic adenocarcinoma into cases with or without Gleason pattern 5 when stratifying patients into randomized trials.125
Based on the above data, there is a compelling case to retain Gleason score 3 + 5 = 8 as Grade Group 4. However, it is less clear at this point, in large part because of its rarity, whether Gleason score 5 + 3 = 8 should be more appropriately categorized as Grade Group 4 or 5.
Gleason Scores 4 + 5, 5 + 4, and 5 + 5: Are They All Grade Group 5?
High-grade prostate carcinomas have historically been considered Gleason scores 8 to 10. However, it is now clear that patients with Gleason score 8 have a significantly better prognosis than those with Gleason scores 9 to 10 and this difference is reflected in the Grade Group system.9 However, Grade Group 5 is heterogeneous and includes Gleason scores 4 + 5 = 9, 5 + 4 = 9, and 5 + 5 = 10. Some have questioned whether there should be a distinction between the different Gleason scores in Grade Group 5. It is challenging to evaluate if there are differences among these 3 scores because Grade Group 5 corresponds to only 4% to 10% of all prostate carcinoma, and because most papers do not discriminate between these scores.9,57,126 Historically, Grade Group 5 patients typically presented with advanced stage, where surgery was not the preferred treatment choice.127–129 However, some studies have shown that surgery might be beneficial for a subgroup of these patients.130,131 If true, the separation of Grade Group 5 into different subgroups may have a predictive value for surgical treatment.
There have been very few studies on this issue. Wissing et al57 demonstrated that Gleason score 4 + 5 = 9 had a lower treatment failure rate following radical prostatectomy (82%) when compared with Gleason score 5 + 4 = 9 (97%). Flood et al132 also analyzed a small cohort of 49 men who underwent radical prostatectomy with Grade Group 5 prostate cancer. They analyzed percent Gleason pattern 5 and a large range of high-grade tumor morphologies, many of which were significantly associated with biochemical recurrence. The study may have been confounded by the inclusion of intraductal carcinoma as Gleason pattern 5, which may affect the prognosis in Grade Group 5. In a study performed in a radiation therapy cohort, the rate of distant metastasis was significantly higher in needle biopsy Grade Group 5 patients with primary Gleason pattern 5 compared with 4 + 5 = 9.48 This subtlety has been incorporated into the most recent NCCN Guidelines as a “very high risk” category, although management currently does not differ for this subgroup.27
Even if differences between Grade Group 5 with secondary (4 + 5 = 9) versus primary (5 + 4 = 9/5 + 5 = 10) pattern 5 are validated in future studies, it is uncertain whether they warrant splitting Grade Group 5 into 2 groups. From the practical standpoint, the prognosis for both Gleason score 4 + 5 = 9 and 5 + 4 = 9 is very poor, and in most cases the distinction would not affect choice of therapy. By reporting the Gleason score alongside the Grade Group in such cases, those with primary Gleason pattern 5 could still be recognized, which may be useful for clinical trials, novel therapy, and so on, without the need for altering the overall Grade Group.
This may be especially relevant in light of significant interobserver variability in the grading of Gleason pattern 5.133 Several studies have shown that Gleason pattern 5 is underrecognized by general pathologists in approximately 50% of cases, making a determination of Gleason score 4 + 5 = 9 versus 5 + 4 = 9 even more difficult.134,135
WORKING GROUP 5: CRIBRIFORM CARCINOMA
A summary of recommendations on cribriform carcinoma is seen in Table 7.
Introduction
The term “cribriform” is derived from the Latin word cribrum (ie, sieve) and was first introduced in the prostate cancer lexicon by Gleason to describe glands composed of sheets of tumor cells that form cohesive rounded or irregularly shaped trabeculae with perforations or multiple “punched out” lumina.136 Cribriform glands may be spherical or oblong, and may sometimes have irregular borders. Although the cribriform pattern has the best interobserver reproducibility among genitourinary pathologists (ranging from 54%–79%) when compared with other Gleason 4 patterns, there is still significant variability.42,137
Large Versus Small Cribriform Pattern
Small cribriform glands, which were part of Gleason pattern 3 in the ISUP 2005 Gleason modification and consisted of rounded cribriform glands equal in size to benign glands, are now included as Gleason pattern 4.4,5 Iczkowski et al139 were the first to study the significance of cribriform patterns according to their size. They defined large cribriform pattern as having more than 12 luminal spaces, with cribriform to focal solid foci, while small cribriform glands had fewer than 12 luminal spaces without any solid foci. These criteria were subsequently used by McKenney et al140 and Keefe et al141 in their studies to distinguish large versus small cribriform glands. An area exceeding the size of an average benign gland was used by Trudel et al142 to describe large cribriform glands, while Hollemans et al143 defined expansile/large cribriform glands to have a diameter of at least twice the size of adjacent benign glands. The clinical significance of the distinction between small and large cribriform pattern is unclear (Figure 2, A through F). Although Hollemans et al143 in their study showed that the large cribriform pattern was associated with a worse clinical prognosis compared with small cribriform pattern, Keefe et al141 failed to show any correlation between the size of the cribriform pattern and upgrading or stage at radical prostatectomy. Iczkowski et al139 did not find a correlation between cribriform gland size and biochemical recurrence after prostatectomy. Further studies are clearly required to determine the prognostic significance of the distinction of small and large cribriform carcinoma.
Definition of Glomeruloid Pattern
The glomeruloid pattern is described as a dilated gland containing intraluminal cribriform structures with typically a single point of attachment, resembling a renal glomerulus.144 Glomeruloid glands may be small or large (Figure 2, E and F). Many tumors with the glomeruloid pattern show a spectrum of these patterns and also show transition into small and large cribriform glands, with some pathologists considering the large glomeruloid structures as a morphologic variant of the large cribriform pattern.31,144 However, a recent three-dimensional analysis of prostate cancer glands using confocal laser scanning microscopy did not show a continuum between these 2 patterns.145 In a case-control study by Kryvenko et al,146 which analyzed Grade Groups 2 and 3 cancers at radical prostatectomy with metastasis, glomeruloid Gleason pattern 4 was found to be more favorable and less likely associated with regional lymph node metastasis.
Reporting of Cribriform Pattern on Needle Biopsy
Several retrospective studies have investigated the clinical significance of reporting cribriform growth in biopsy and radical prostatectomy specimens. Three studies in particular, all by Kweldam et al,147 have demonstrated significant clinical associations for cribriform architecture and/or intraductal carcinoma identified in core biopsy specimens. In 1 study, the presence of invasive cribriform or intraductal growth in a biopsy specimen outperformed reporting percentage pattern 4 in terms of predicting biochemical recurrence after radical prostatectomy and/or radiation therapy in patients with Grade Group 2 prostate cancer. In the second study, biopsy cases with cribriform/intraductal morphology were associated with worse disease-specific survival than those with similar Gleason scores that did not contain either of these growth patterns.148 In the third study, Kweldam et al149 demonstrated that men with Grade Group 2 prostate cancer on core biopsy without cribriform/intraductal carcinoma have similar biochemical recurrence free survival after radical prostatectomy and radiation therapy as those with Grade Group 1 disease. A recent review in 2019 by Kweldam et al138 summarized these findings. A significant problem with the studies by Kweldam et al is that patients were from the European Randomized Study of Screening for Prostate Cancer, which began in 1994 and in which patients underwent only a 6 core (sextant) biopsy, so the data may not be applicable to contemporary prostate cancer grading. Men with Grade Group 1 (Gleason score 3 + 3 = 6) sampled by sextant biopsy may have significant undersampling and a high risk of having more aggressive, nonsampled cancer in their prostate. Consequently, their baseline comparison of Grade Group 2 prostate cancer on core biopsy without cribriform/intraductal carcinoma to Grade Group 1 cancer is flawed, as the Grade Group 1 cases they used as a control for comparison may not accurately reflect contemporary Grade Group 1 cancers.
The authors also do not distinguish between intraductal carcinoma and cribriform carcinoma (see below). By combining cribriform carcinoma and intraductal carcinoma into 1 group, the authors cannot evaluate whether the adverse prognosis in cases with cribriform carcinoma was not driven by the concurrent intraductal carcinoma.
Flood et al42 found the overall extent of Gleason pattern 4 on TRUS-guided biopsy was strongly associated with upgrading and upstaging, and that cribriform morphology correlated to the overall extent of pattern 4. The presence of cribriform morphology showed near-perfect interobserver agreement whereas ill-defined glands, fused glands, and glomerulations were not highly reproducible among observers. In the Keefe et al study, cribriform morphology detected on biopsy in patients with Gleason score 3 + 4 = 7 prostate cancer was associated with tumor upstaging at the time of RP.141
In a more recent study, Masoomian et al150 also showed patients with cribriform and intraductal carcinoma on biopsy specimens have more advanced pathologic stage at radical prostatectomy, independent of grade. In sum, all of these studies that included only biopsy specimens suggest patients who have cribriform architecture and/or intraductal carcinoma may not be good candidates for active surveillance, although to date no studies have specifically investigated the impact of cribriform glands in an actual active surveillance population that was prospectively followed.
In prostate cancer patients treated with external beam radiotherapy, 1 recent study assessed the impact of cribriform architecture and/or intraductal carcinoma in men with Grade Groups 2 and 3 on biopsy.151 On multivariable analysis, only cribriform pattern with intraductal carcinoma was associated with inferior distant metastasis-free survival and disease-specific survival. The authors concluded that cribriform pattern with intraductal carcinoma was associated with adverse outcomes in men with Gleason 7 prostate cancer treated with external beam radiotherapy while cribriform pattern without intraductal carcinoma was not so associated. Future studies may benefit from more clearly separating these 2 histologic entities when assessing various outcomes.
Reporting of Cribriform Pattern on Radical Prostatectomy
Among patients with Gleason score 7 cancer, Choy et al31 demonstrated the presence of cribriform architecture was associated with decreased 5-year biochemical-free survival when compared with Gleason score 7 cancers without this architecture. Although Gleason score 7 disease having only glomeruloid architecture had significantly lower 5-year biochemical-free survival than Gleason score 6 cancers, the prognosis of the patients with glomeruloid glands was not as adverse when compared with those with cribriform architecture. Hollemans et al143 reported that large invasive cribriform architecture (>2 times the size of adjacent benign glands) was associated with a higher percentage of Gleason pattern 4, extraprostatic extension, and more frequent lymph node metastases when compared with small invasive cribriform lesions and/or intraductal carcinoma. Iczkowski et al139 reported that among Gleason 4 patterns, both large and small cribriform glands were associated with biochemical failure. Large cribriform lesions were also an independent predictive factor for biochemical recurrence-free survival in Grade Group 2 prostate cancer patients.143
Dong et al152 analyzed different Gleason pattern 4 morphologies (poorly formed glands, fused glands, and cribriform pattern) in radical prostatectomies and found cribriform gland pattern was an independent predictor of biochemical recurrence and metastasis. In a nested case-control study of radical prostatectomy specimens, Kweldam et al153 found cribriform pattern to be an adverse independent predictor for distant metastasis-free survival and disease-specific survival, while fused, ill-defined, and glomeruloid patterns were not. In a later study, the same investigators found invasive cribriform and/or intraductal carcinoma were associated with postoperative biochemical recurrence, whereas percent of Gleason pattern 4 as a continuous parameter was not.147 Trudel et al142 also investigated the significance of cribriform growth in radical prostatectomies and analyzed large cribriform growth and intraductal carcinoma separately; they demonstrated that any amount of large cribriform or intraductal carcinoma was associated with increased biochemical recurrence, but when analyzed separately, no difference in prognosis was seen between large cribriform growth and intraductal carcinoma. In a more recent study by Hollemans et al,143 the authors separately analyzed both large and small cribriform patterns, as well as intraductal carcinoma, in radical prostatectomy specimens and found large cribriform morphology (in addition to positive surgical margins and pathologic T stage) was an independent predictor of biochemical recurrence on multivariate analysis, whereas small cribriform growth, intraductal carcinoma, and percentage Gleason pattern 4 were not.
McKenney et al140 found cases with any cribriform pattern (either small or large/expansile) were associated with lower recurrence-free survival than those with poorly formed glands, leading the authors to recommend that Grade Group 2 carcinomas with cribriform architecture should be assigned a separate prognostic group from those without it.
Detection of Cribriform Glands on Multiparametric MRI
A few studies have investigated the ability of multiparametric MRI to detect cribriform and intraductal prostate cancer, which would be particularly useful in patients with Grade Group 1 and 2 prostate cancers on biopsy who may be considering active surveillance. In a study by Tonttila et al,154 multiparametric MRI detected cancers with cribriform or intraductal growth with a sensitivity of 90.5%. Hollemans et al143 investigated a series of patients with Grade Group 2 on biopsy with matched radical prostatectomy specimens and found that, of the 29 PI-RADS 5 lesions, 3 (10.3%) were negative for cribriform glands on biopsy and prostatectomy; 9 (31.0%) were negative for cribriform glands on biopsy yet positive in prostatectomy; and 17 (58.7%) had cribriform glands on biopsy and prostatectomy. Prendeville et al155 found multiparametric MRI with a fusion targeted biopsy was significantly associated with increased detection of cribriform/intraductal carcinoma compared with systematic sextant biopsy of multiparametric MRI-negative regions. These recent studies further support the NCCN guidelines recommendation to consider performing a multiparametric MRI when considering patients for active surveillance.
Relation of Cribriform Glands to Molecular Hallmarks of Aggressive Disease
Recent literature also demonstrated cribriform and intraductal growth are associated with increased genomic instability and other molecular features that are typically found in aggressive, lethal prostate cancers.156–159 These molecular findings are certainly concordant with the clinical observations of cribriform growth pattern being associated with adverse outcomes. A recent study from Greenland et al160 revealed the expansile cribriform, simple cribriform, poorly formed, and fused patterns of Gleason pattern 4 were all associated with a higher Genomic Prostate Score (Oncotype Dx). The expansile cribriform pattern, defined as solid large acini with greater than 12 luminal spaces, which also included intraductal carcinoma, had the strongest association with the Genomic Prostate Score, while the glomeruloid pattern was associated with a lower Genomic Prostate Score.
Conclusions
As already discussed in relation to specific studies, there are significant issues with some of the studies on cribriform glands, including (1) different definitions as to what constitutes the adverse morphology in cases with cribriform glands (ie, large versus small cribriform; definition of large cribriform); (2) noncontemporary patient populations sampled by sextant (6 core) biopsies; and (3) not distinguishing between intraductal carcinoma and cribriform carcinoma.
Possibly reflecting these uncertainties as well as lack of full familiarity with the literature, according to the GUPS pathologist survey, only 49% (104 of 211) of participants report whether the Gleason pattern 4 component on needle biopsy cases with Gleason score 7 (Grade Groups 2 and 3) showed a cribriform morphology. There was no correlation of reporting with years in practice or academic institutions versus community hospitals. Whereas only 40% (46 of 116) of the US-based GUPS respondents report whether cribriform glands are present on needle biopsy, 76% (38 of 50) of the respondents from Canada, Europe, and Australia report this finding (P < .001). Approximately 44% (19 of 43) of respondents from Asia and Mexico and Central and South America report whether cribriform pattern 4 is present. Of all pathologists who report the presence of cribriform morphology on needle biopsies with Gleason score 7 (Grade Groups 2 and 3), 88% (88 of 100) report present versus absent, and only 12% (12 of 100) report as a percent of the Gleason pattern 4 cancer. When asked “If you distinguish between small and large cribriform glands on needle biopsy, what criteria do you use,” 77% (79 of 103) replied they do not distinguish between small and large cribriform glands. For those participants who did distinguish based on size, 54.5% (12 of 22), 32% (7 of 22), and 13.5% (3 of 22) define large cribriform glands as 2× size of adjacent benign glands, spanning the width of the needle core, or 12 or more lumens, respectively. Whether to report the size of cribriform glands and, if so, which size criteria to use requires more data; given this and the lack of a prevailing practice pattern, no GUPS recommendation on this issue can be made.
In the GUPS survey, 57% (121 of 211) of pathologists did not report whether the Gleason pattern 4 in Gleason score 7 (Grade Groups 2 and 3) was cribriform in radical prostatectomy specimens. There was a difference in reporting based on geography with only 33% (53 of 159) of GUPS participants from the United States, Mexico, and Central and South America reporting cribriform glands in this setting compared with 72% (36 of 50) of respondents from Canada, Europe, and Australia (P < .001).
In the GUPS clinical survey, for cases with Gleason score 7 (Grade Groups 2 and 3), 47% (175 of 371) responded that knowing if the Gleason pattern 4 includes cribriform glands would affect patient counseling or management. Of the 58.5% (217 of 371) of clinician respondents who consider active surveillance in men with Gleason score 3 + 4 = 7 (Grade Group 2) with more than a 10-year life expectancy, 65% (139 of 215) responded that whether the Gleason pattern 4 included cribriform glands impacted their decision.
Notwithstanding the above issues, the vast majority of studies on prostate cancer with cribriform architecture, whether inclusive of intraductal carcinoma or not, demonstrate associations between these prostate cancers and both adverse clinical outcomes and molecular features typically seen in advanced disease. Based on these findings, GUPS recommends reporting the presence or absence of cribriform glands in biopsy and radical prostatectomy specimens with Gleason pattern 4 carcinoma. Given that reporting cribriform glands in Gleason pattern 4 is currently not standard of care, there may be variable adoption of this recommendation based on geographic, institutional, and clinician preference. Additional contemporary studies where cribriform glands are better and more uniformly defined are still needed to further characterize the effect of cribriform glands on prognosis and subsequent therapeutic decisions.
WORKING GROUP 6: INTRADUCTAL CARCINOMA (IDC-P)
A summary of recommendations on intraductal carcinoma is seen in Table 8.
Diagnostic Criteria of IDC-P and Atypical Intraductal Proliferation
IDC-P is an intra-acinar and/or intraductal neoplastic epithelial proliferation that exhibits greater architectural and/or cytologic atypia than high-grade prostatic intraepithelial neoplasia (PIN). It is considered a distinct entity in the 2016 WHO classification of tumors of the prostate.18
Several investigators have attempted to refine the diagnostic criteria so the diagnosis of IDC-P can be reproducible.161–163 Guo and Epstein162 proposed a set of morphologic criteria for IDC-P on needle biopsy, which are currently the most commonly used criteria to diagnose IDC-P in all types of prostate specimens. In the pathologist survey, 100% (211 of 211) of respondents used these criteria. In the setting of enlarged, atypical glands with at least a partially retained basal cell layer, the major criteria include (1) solid growth filling the gland or dense cribriform architecture (the latter is defined as >50% of the gland is composed of epithelium relative to luminal spaces); or (2) loose cribriform (<50%) or micropapillary architecture having either marked pleomorphism/nucleomegaly or comedonecrosis (Figures 3, A through F and 4, A through D). In the initial definition proposed by Guo and Epstein,162 marked nucleomegaly was defined as more than 6 times normal. There has been some controversy whether this reflects diameter or area. In practice, this was meant to reflect the rare case of IDC-P lacking solid or dense cribriform architecture, yet showing marked variation in size and shape of nuclei that far exceeds that seen in high-grade PIN. A specific size criterion is not required but should reflect bizarre pleomorphic cytologic atypia (personal communication, Jonathan Epstein, MD). Some pathologists use an area more than 6 times that of adjacent benign nuclei. The presence of any one of these criteria is considered diagnostic for IDC-P.
Minor criteria that have been proposed by others include (1) involvement of many glands (>6); (2) involved glands are irregular or branching at right angles; (3) easily identifiable/frequent mitoses; and (4) “central maturation,” that is, 2 distinct cell populations consisting of tall, pleomorphic, mitotically active cells at the periphery and cuboidal, monomorphic, quiescent cells at the center.161,163 Solid intraluminal growth filling the gland is the most reproducible feature for IDC-P.164 Some uncertainty remains about the definition of “dense” cribriform pattern, with recommendations for both 50% and 70% epithelium relative to luminal spaces existing in the literature.165 The survey revealed that a slight majority of GUPS members (122 of 211; 58%) used more than 50% as opposed to more than 70% criterion in diagnosing dense cribriform IDC-P. All diagnostic criteria for IDC-P should be applied particularly stringently in biopsy specimens, especially in the absence of invasive prostate cancer, as some would proceed to definitive treatment even when IDC-P is the only finding. To achieve this goal, there should be definitively more epithelium than luminal space. In cases where they are approximately equal, it is prudent to be conservative and diagnose the lesion as not meeting full criteria for IDC-P.
An intraductal proliferation of prostatic secretory cells may occasionally show a greater degree of architectural complexity and/or cytologic atypia than typical high-grade PIN, yet fall short of the strict diagnostic threshold for IDC-P (Figure 5, A through F).166 The terms “atypical cribriform lesion,” “atypical intraductal cribriform proliferation,” “low-grade intraductal carcinoma,” and more recently “atypical proliferation suspicious for intraductal carcinoma” have all been proposed to designate this spectrum of proliferations.163,167,168 The term “atypical intraductal proliferation (AIP)” is preferred as it generally includes all morphologic patterns that have been described.168 In the pathologist survey, 50% (105 of 211) of respondents used AIP, 32% (68 of 211) described the lesion as IDC-P versus high-grade PIN, while 13% (28 of 211) would assign cribriform high-grade PIN. As AIP is not used by a large majority of pathologists, it would be even less understood by clinicians. When “AIP” is used in a biopsy report, there should be some comment that AIP has some but not all the features of IDC-P (ie, suspicious for but not definitive for IDC-P).
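The major criteria described above can be written out as a small decision rule, purely as an illustration of how the criteria combine. The sketch below is not a diagnostic tool; the class name, field names, and the numeric `epithelium_fraction` estimate are hypothetical conveniences, and the default 50% cutoff reflects the threshold used in this section (a 70% cutoff also exists in the literature).

```python
from dataclasses import dataclass


@dataclass
class IntraductalLesion:
    """Hypothetical summary of one enlarged, atypical gland (illustrative only)."""
    basal_cells_retained: bool   # at least partially retained basal cell layer
    solid_growth: bool           # solid growth filling the gland
    cribriform: bool
    micropapillary: bool
    epithelium_fraction: float   # estimated share of the gland that is epithelium (0..1)
    marked_pleomorphism: bool    # bizarre pleomorphic atypia / marked nucleomegaly
    comedonecrosis: bool


def meets_major_criteria(lesion: IntraductalLesion, dense_threshold: float = 0.5) -> bool:
    """Return True if any one major criterion, as summarized in the text, is met."""
    if not lesion.basal_cells_retained:
        return False  # the criteria apply only with a retained basal cell layer
    # "Dense" requires strictly more epithelium than the threshold; an
    # approximately equal split is conservatively treated as not dense.
    dense_cribriform = lesion.cribriform and lesion.epithelium_fraction > dense_threshold
    if lesion.solid_growth or dense_cribriform:
        return True
    loose_cribriform = lesion.cribriform and not dense_cribriform
    severe_feature = lesion.marked_pleomorphism or lesion.comedonecrosis
    return (loose_cribriform or lesion.micropapillary) and severe_feature
```

Using a strict inequality at the threshold mirrors the conservative advice for biopsy specimens: a gland with roughly equal epithelium and lumen does not qualify on architecture alone.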
AIP is typically characterized by loose cribriform proliferations that lack intraluminal necrosis and/or severe nuclear atypia required for the diagnosis of IDC-P.168–170 The most common diagnostic scenarios include (1) intraluminal proliferation with loose cribriform architecture (more luminal spaces relative to epithelium), but without significant nuclear pleomorphism or necrosis; (2) solid or dense cribriform structure incompletely spanning the glandular lumen; or (3) any lesion, regardless of the architecture, with significant nuclear atypia or pleomorphism beyond high-grade PIN, but falling short of the current diagnostic criteria for IDC-P. The great majority (>90%) of AIP cases demonstrate loose cribriform architecture.168
High-grade PIN represents the most significant and clinically relevant differential diagnosis, particularly in a prostate biopsy. High-grade PIN glands are typically smooth with rounded contours, similar in size to the adjacent benign glands. The cells lack marked nuclear atypia and the nuclei are only 2 to 3 times the size of the adjacent benign nuclei, with exceptionally rare mitoses.171,172 Micropapillary patterns can be seen in both high-grade PIN and IDC-P. The presence of solid nests, dense cribriform architecture, or comedonecrosis essentially rules out the diagnosis of high-grade PIN.161–163 Most importantly, significant nuclear atypia (ie, bizarre pleomorphic nuclei), with nuclei 6 or more times the size of the adjacent benign nuclei, is only seen in IDC-P.
AIP occasionally exhibits loose cribriform architecture with “low-grade” morphology that overlaps with what in the past was referred to as “cribriform high-grade PIN,” and falls short of the diagnostic criteria for IDC-P.163,168,173,174 The distinction between AIP and cribriform high-grade PIN is particularly problematic as there are no objective, reproducible morphologic criteria to distinguish between them. Based on the survey results, 94% (199 of 211) of the pathologists do not render a diagnosis of “low-grade IDC-P” and the use of this term is not recommended in practice. The majority (122 of 211; 58%) of surveyed pathologists also do not currently use the diagnosis of “cribriform high-grade PIN” on needle biopsy. The GUPS recommendation is to not diagnose cribriform high-grade PIN on biopsy, but rather to use the term AIP in this setting. Among the authors of this manuscript, a majority also favored replacing “cribriform high-grade PIN” with the more encompassing term “AIP” on radical prostatectomy. However, a minority of the authors preferred to still have the option to diagnose “cribriform high-grade PIN” on radical prostatectomy. The issue is not as critical on radical prostatectomy specimens for patient management, and the majority of cases of AIP on radical prostatectomy will occur in a spectrum with IDC-P, where the latter will be diagnosed.
Use of Immunohistochemistry in Evaluating IDC-P
Of respondents, 84% (177 of 211) reported using immunohistochemistry (IHC) for basal cell markers when a biopsy shows Gleason score 6 cancer and cribriform glands that include a differential diagnosis of IDC-P versus Gleason pattern 4 cancer. A smaller majority (122 of 211; 58%) of GUPS participants would perform IHC for basal cell markers if the biopsy showed Gleason score 7 and cribriform glands, with a differential diagnosis of IDC-P versus Gleason pattern 4 cancer. A larger majority would not perform basal cell IHC on needle biopsy (147 of 211; 70%) or radical prostatectomy (179 of 211; 85%) if the results would not change the Gleason score/Grade Group.
Immunostaining for PTEN and ERG may also be helpful for the diagnosis of IDC-P.170,175–177 Loss of PTEN expression and positive ERG expression are seen in up to 80% and 70% of IDC-P and AIP cases, respectively, while in high-grade PIN, PTEN expression is typically retained and ERG expression is uncommon (∼18%); the latter is typically associated with ERG+ prostate cancer. Of note, a normal staining pattern for these markers does not exclude the diagnosis of IDC-P and AIP. Only 19% (40 of 211) of surveyed GUPS pathologists use PTEN and/or ERG IHC to provide a more definitive diagnosis if a biopsy shows AIP, when considering IDC-P versus high-grade PIN.
IDC-P Versus Ductal Adenocarcinoma
Other significant carcinomas that should be distinguished from IDC-P include ductal adenocarcinoma.165,178 Ductal adenocarcinoma is a histologic variant of prostatic adenocarcinoma defined by its cytology, characterized by tall, pseudostratified columnar cells, often lining papillary structures with fibrovascular cores. Before the recognition of IDC-P, all ductal adenocarcinomas were uniformly considered invasive carcinoma even if they were surrounded by basal cells. While the majority of intraductal carcinomas show acinar morphology, in occasional cases an entirely intraductal lesion may have the morphology of prostatic ductal adenocarcinoma. There are no data on the incidence of this scenario, as IHC has not typically been performed in the setting of ductal adenocarcinoma, but it appears to be rare. In such cases, a diagnosis of “IDC-P with ductal morphology” is appropriate (Figure 6, A through D). As with IDC-P in general, the rare case of IDC-P with ductal morphology would not be graded.
Clinical Significance and Management of IDC-P and AIP
The diagnostic criteria and reporting guideline for IDC-P have only recently been established, and its incidence is most likely underestimated in published studies. A recent publication found that IDC-P is more prevalent than previously thought, with an average incidence increasing from 2.1% in low-risk patient cohorts to 23.1%, 36.7%, and 56.0% in moderate-risk, high-risk, and metastatic/recurrent disease risk cohorts, respectively.179 The incidence of IDC-P in prostate needle biopsy varies significantly based on the Gleason grade and tumor volume. It ranges from 2.8% in a prospectively collected biopsy series to 73.9% in cohorts of locally advanced high-risk carcinoma of the prostate.180–183 IDC-P without concomitant prostate carcinoma is, however, exceedingly rare, involving less than 0.06% of prostate biopsy specimens.162,184 IDC-P is also highly prevalent in tumors after androgen deprivation therapy or chemotherapy.185,186
Studies have established that IDC-P represents an independent adverse pathological factor in both radical prostatectomy and needle biopsy specimens, which may influence response to current therapeutic regimens for advanced stage prostate cancer. In radical prostatectomy, IDC-P correlates with other adverse pathological features in the associated invasive prostate cancer, including higher Gleason score, larger tumor volume, and greater probability of extraprostatic extension, seminal vesicle invasion, and pelvic lymph node metastasis, and also independently predicts biochemical recurrence, progression-free survival, and cancer-specific mortality after radical prostatectomy.166,187–194 IDC-P is also an independent predictor of biochemical recurrence in patients receiving neoadjuvant hormone therapy, as well as survival in patients with castration resistant prostate cancer treated with docetaxel.181,186,188,192
In biopsy specimens, IDC-P is typically seen with high-grade, high-volume prostate cancer and is associated with adverse findings in radical prostatectomy and poor outcomes. IDC-P diagnosed in prostate biopsies provided an independent prognostication of early biochemical recurrence, cancer-specific survival, survival in patients with distant metastasis at presentation, and metastatic failure after radiation therapy in intermediate- and high-risk prostate carcinoma.60,183,195,196 Although there is some evidence IDC-P has independent prognostic significance even in high-grade and advanced carcinomas, currently its presence does not affect treatment or prognosis to an extent that warrants the cost and expense of identifying it in these settings. The association with adverse findings in radical prostatectomy is present even when the concomitant invasive prostate cancer is Gleason score 6 or when IDC-P is diagnosed in biopsy without concomitant invasive prostate carcinoma.184,197
Integration of IDC-P into the Gleason system may lead to more accurate predictions of patient outcome after radical prostatectomy.198 Inclusion of IDC-P in prostate biopsies in a preoperative model could potentially improve the prediction of the pathological stage in the radical prostatectomy specimens.182 It also significantly improved the predictive accuracy of a postoperative nomogram that used preoperative clinicopathological variables to predict PSA recurrence after radical prostatectomy.192
Only limited data are available regarding the significance of AIP. One study found the prostate cancer associated with AIP had worse pathological features than prostate cancer associated with high-grade PIN.191 Another study found that AIP-associated prostate carcinoma had similar clinicopathologic features as IDC-P–associated carcinoma.167 When diagnosed on needle biopsy, AIP is potentially considered a marker of unsampled cancer, and it is associated with an increased risk (50%) of invasive carcinoma and/or IDC-P on repeat biopsy.168,170 In addition, for patients with AIP associated with Grade Group 1 and Grade Group 2 carcinoma without cribriform architecture, AIP is a marker of unsampled IDC-P and several other adverse pathological features at radical prostatectomy.168
The appropriate clinical management of patients with IDC-P has not yet been established. IDC-P may be resistant to current therapeutic regimens for aggressive prostate cancer and may require a multimodal approach and novel therapy.179 Active surveillance would not be recommended for the rare cases with Gleason score 3 + 3 = 6 associated with IDC-P on biopsy. A majority (268 of 361; 74%) of clinicians from the GUPS survey would not recommend active surveillance for a man with prostate carcinoma who is otherwise a candidate, if the needle biopsy also showed IDC-P. Some pathology experts recommend definitive therapy for men with IDC-P on needle biopsy, even in the absence of pathologically documented invasive prostate cancer.184 Recently, the NCCN panel recommended patients with IDC-P on biopsy have germline testing for genes involved in Lynch syndrome and DNA repair.27 Cancer predisposition next-generation sequencing panel testing may also be considered in this setting. When IDC-P is reported on needle biopsy in the absence of invasive carcinoma, a slight majority (188 of 361; 52%) of clinicians in the GUPS clinical survey would proceed with definitive therapy (ie, surgery, radiation, etc.), while 40% (145 of 361) would repeat the biopsy to look for invasive carcinoma, and the remaining were unsure how to proceed in this scenario.
Reporting Intraductal Carcinoma of the Prostate
There is a general consensus among pathologists that IDC-P should be reported on both prostate biopsy and radical prostatectomy specimens.5,18,27,199,200 There is, however, a considerable controversy whether IDC-P should be factored into the Gleason score, particularly on prostate biopsy specimens.5,18,165,173,174,177,179,201–204
There are 2 different biological pathways in the development of IDC-P. The majority of IDC-P are associated with invasive adenocarcinoma and the IDC-P is believed to originate via retrograde intraductal spread of the invasive adenocarcinoma. A small subset of IDC-P is not associated with invasive adenocarcinoma and may represent a precursor lesion to prostate cancer.205 In the first setting, patients usually have a high-grade and high-volume invasive adenocarcinoma (Figure 7, A and B). As it is often difficult to differentiate IDC-P from Gleason pattern 4 or 5 without IHC, some urologic pathology experts recommend including IDC-P components as Gleason pattern 4 (cribriform morphology) or 5 (comedonecrosis) in the tumor grade.165,201
However, it would be problematic to grade IDC-P as Gleason pattern 4 or 5 cancer in the small subset of cases that show only a low-grade (Grade Group 1) prostate cancer (Figure 7, C and D) or no evidence of invasive cancer (Figure 8, A through D).184,197 These IDC-P may represent precursor lesions that have significantly better prognostic characteristics than those with IDC-P admixed with invasive high-grade cancer.205 These precursor, isolated IDC-P in radical prostatectomy specimens show different expression patterns of PTEN and ERG from the concurrent low-grade cancer (when present), suggesting isolated IDC-P is not a likely precursor to the associated low-grade invasive prostate cancer, but rather a molecularly unique in situ tumor of unclear clinical significance.206 Also, there are no studies showing that grading IDC-P as invasive carcinoma more accurately correlates with prognosis. For example, in a case of Gleason score 4 + 3 = 7 (Grade Group 3) with IDC-P showing comedonecrosis, if the IDC-P were included in the grade, the tumor would be Gleason 4 + 5 = 9 (Grade Group 5); no data support this marked increase in grade in this setting.
One argument that has been proposed for grading IDC-P the same as if it were invasive carcinoma is that in historic studies where grade correlated with prognosis, IDC-P, if present, was graded as if it were invasive carcinoma. However, in these historic studies there would have been only a very small fraction of prostate cancer cases where the highest grade would have changed depending on whether IDC-P was graded as invasive carcinoma or not. The prognosis for this small fraction of cases with a grade change would not have had any statistically significant impact on how well the highest grade overall in the study correlated with prognosis. Rather, the overall correlation of grade and prognosis would be driven by the much larger number of cases either lacking IDC-P or where IDC-P was associated with invasive high-grade cancer. By not factoring IDC-P into the grade, future studies can be performed to assess the specific issue of whether grading IDC-P or excluding it from the grade better correlates with prognosis.
GUPS recommends it is not necessary to perform basal cell IHC on needle biopsy and radical prostatectomy to identify IDC-P if the results of the stains would not change the overall highest Gleason score/Grade Group for the case. This is the situation in the vast majority of cases, as IDC-P is typically associated with overt infiltrating high-grade cancer. In the setting of overt invasive high-grade carcinoma, it is reasonable to grade glands that have the differential diagnosis of high-grade carcinoma versus IDC-P as all invasive carcinoma. Conversely, GUPS does recommend performing IHC for basal cell markers when the biopsy shows either no definite invasive carcinoma or Gleason score 6 cancer along with cribriform glands that include a differential diagnosis of IDC-P versus Gleason pattern 4 cancer.
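The size of the grade jump in this example can be checked with a short sketch of the standard Gleason score to Grade Group mapping. The function name `grade_group` is hypothetical, and the sketch assumes contemporary practice in which only patterns 3 through 5 are assigned; it is an illustration of the arithmetic, not a grading tool.

```python
def grade_group(primary: int, secondary: int) -> int:
    """Map a Gleason score (primary + secondary pattern) to a Grade Group.

    Assumes only patterns 3-5 are assigned, as in contemporary grading.
    """
    if not (3 <= primary <= 5 and 3 <= secondary <= 5):
        raise ValueError("patterns must be 3, 4, or 5")
    score = primary + secondary
    if score == 6:
        return 1                          # 3 + 3
    if score == 7:
        return 2 if primary == 3 else 3   # 3 + 4 versus 4 + 3
    if score == 8:
        return 4                          # 4 + 4, 3 + 5, 5 + 3
    return 5                              # scores 9 and 10


# The example from the text: invasive component alone versus IDC-P with
# comedonecrosis counted as pattern 5.
without_idcp = grade_group(4, 3)
with_idcp = grade_group(4, 5)
print(without_idcp, with_idcp)
```

Counting the intraductal component as pattern 5 moves the case two Grade Groups, which is the marked increase the text describes as unsupported by data.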
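As a rough illustration only, the recommendation in this paragraph can be expressed as a three-way rule. The enum, function, and parameter names below are hypothetical, and the third outcome marks situations that the paragraph does not explicitly address rather than any stated GUPS position.

```python
from enum import Enum
from typing import Optional


class IhcAdvice(Enum):
    RECOMMENDED = "recommended"
    NOT_NECESSARY = "not necessary"
    NOT_ADDRESSED = "not explicitly addressed"


def basal_cell_ihc_advice(
    highest_invasive_gleason_score: Optional[int],  # None = no definite invasive carcinoma
    cribriform_idcp_vs_pattern4: bool,              # cribriform glands: IDC-P versus pattern 4
    stain_could_change_case_grade: bool,            # would the overall highest grade change?
) -> IhcAdvice:
    """Sketch of the basal cell IHC recommendation as summarized in the text."""
    # Stains are not necessary when they cannot change the overall (highest)
    # Gleason score/Grade Group for the case.
    if not stain_could_change_case_grade:
        return IhcAdvice.NOT_NECESSARY
    no_or_low_grade_cancer = (
        highest_invasive_gleason_score is None
        or highest_invasive_gleason_score <= 6
    )
    if no_or_low_grade_cancer and cribriform_idcp_vs_pattern4:
        return IhcAdvice.RECOMMENDED
    return IhcAdvice.NOT_ADDRESSED
```

Checking the "would not change the grade" condition first keeps the rule aligned with the text's emphasis that, in the vast majority of cases, IDC-P accompanies overt high-grade invasive cancer and staining adds nothing to the case-level grade.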
Only approximately one quarter (49 of 211; 23.2%) of GUPS survey participants would include IDC-P in determining the final Gleason score, either both on needle biopsy and prostatectomy (35 of 211; 16.6%), only on needle biopsy (8 of 211; 3.8%), or only on radical prostatectomy (6 of 211; 2.8%). When IDC-P is identified on prostate biopsy without concomitant invasive adenocarcinoma, 89.1% (188 of 211) of surveyed GUPS pathologists would add a comment stating that IDC-P is usually associated with high-grade prostate cancer.
Although definitive therapy may be indicated in most cases, some pathologists recommend immediate repeat biopsy instead of definitive therapy in this scenario. This approach was also subsequently adopted by the WHO in 2016.18
WORKING GROUP 7: MOLECULAR TESTING
A summary of recommendations on molecular testing is seen in Table 9.
Molecular Assays Useful in the Context of Low-Risk (Low-Grade) Disease
Ki67
The Ki67 antigen (detected by the MIB1 antibody) shows a strong correlation with Gleason score and more aggressive disease.207,208 Numerous radical prostatectomy-based studies confirm the independent prognostic value of Ki67 IHC in predicting biochemical recurrence, progression, overall, and disease-specific survival beyond Gleason score, pathological stage, and PSA levels.209–213 In needle biopsies, Ki67-labeling index independently predicts tumor-specific survival, recurrence, and progression, thus suggesting clinical utility of a Ki67 IHC score in the context of active surveillance.214–222 Only a few reports have failed to demonstrate prognostic utility of Ki67 in biopsies, likely because of sampling issues.223,224 Despite Ki67 being among the best validated prognostic biomarkers that has been in use since 1989, it is not ready for use in clinical practice because of the high level of variability in Ki67 scores in various cohorts (ranging from 2.1%–28%) and the difficulty in defining a high-labeling index.212,213 These limitations likely reflect differences in patient cohorts, tumor heterogeneity, pre- and postanalytic variables, sampling size, manual versus automated scoring, and different statistical methodologies in determining “suitable cutpoints.”211,225
PTEN
Among the genomic alterations common in localized prostate cancer, PTEN gene deletions measured by either IHC or fluorescence in situ hybridization are among the most tightly associated with poor outcomes. The frequency of PTEN loss by IHC is closely associated with Grade Groups, and occurs in approximately 20% of localized prostate tumors in most surgical cohorts. In radical prostatectomy samples, PTEN loss is associated with an increased risk of biochemical recurrence, metastasis, and lethal prostate cancer, independent of Grade Group and stage in most cohorts.226–231 Similarly, in needle biopsies, PTEN loss is independently associated with outcomes in patients treated with radiation therapy.232,233 In Grade Group 1 biopsies, PTEN loss is associated with an increased risk of upgrading to Grade Group 2 or higher, both in active surveillance cohorts and surgical cohorts.234–236 In Grade Group 2 biopsies, PTEN loss is associated with risk of nonorgan-confined disease at radical prostatectomy.237 Finally, there is evidence that PTEN loss is associated with metastasis and prostate cancer-specific mortality when present in needle biopsies of surgically treated patients.238 Despite considerable progress, additional studies of active surveillance cohorts are needed to fully establish the utility of PTEN loss in this clinical setting. In addition, evaluation of PTEN loss in core needle biopsy is significantly affected by inter- and intratumor core staining heterogeneity.
RNA-based prognostic assays
The NCCN Prostate Cancer Guidelines have recently included Myriad Genetics' Prolaris, GenomeDx's Decipher, and Genomic Health's Oncotype DX Prostate Cancer as options to assist with risk stratification in men with newly diagnosed low-risk or favorable intermediate-risk prostate cancer.27 The Prolaris assay uses quantitative reverse-transcription polymerase chain reaction analysis to measure changes in the mRNA expression levels of 31 cell cycle progression genes (primarily proliferation based), providing an independent molecular assessment of disease progression. The Prolaris assay report categorizes tumors as less aggressive, more aggressive, or consistent with the average risk of the relevant NCCN-based clinical risk group, and provides an estimate of a patient's 10-year prostate cancer–specific mortality. Multiple studies have demonstrated that Prolaris provides information beyond standard clinical variables in determining prostate cancer–specific mortality and progression to metastatic disease, particularly in the setting of needle biopsies.239–241 However, studies of Prolaris in active surveillance populations have not been completed at this time.
OncotypeDx Genomic Prostate Score is also a quantitative reverse-transcription polymerase chain reaction analysis–based assay that measures expression levels across 12 genes involved in stromal response, androgen signaling, cellular organization, and proliferation, with an additional 5 reference genes.242 OncotypeDx Genomic Prostate Score has predominantly been studied in low-risk and favorable intermediate-risk biopsy samples where it is associated with increased risk of adverse pathology at radical prostatectomy and improves the concordance index of the Cancer of the Prostate Risk Assessment (CAPRA) clinical-pathologic risk classifier.243 OncotypeDx Genomic Prostate Score has been tested in the setting of active surveillance, where it was associated with a higher risk of adverse pathology and biochemical recurrence in patients who underwent subsequent radical prostatectomy.244
The Decipher assay from GenomeDx Biosciences quantifies the expression levels of 22 transcripts of genes involved in cell adhesion and migration, cell cycle control, immune system modulation, cellular differentiation, and androgen signaling in formalin-fixed tumor tissue. The assay yields a single value, on a range of 0 (better prognosis) to 1 (poorer prognosis).245 In practice, tumors are categorized into 3 risk groups: low, intermediate, and high. The assay has been reported to perform better as a prognostic metric than do traditional parameters (ie, serum PSA, tumor grade, and tumor stage). This has been documented for postprostatectomy clinical scenarios of medium- and high-risk cancer, including biochemical recurrence and response to adjuvant radiation therapy.246–248 The assay has been used on biopsies, both to predict metastases after radical prostatectomy and as a prognostic parameter for low-risk patients who are candidates for active surveillance.249,250 Finally, providing the Decipher score to patients is associated with less patient anxiety regarding decisions about receiving adjuvant radiation therapy.251
There are some clinically relevant questions about Decipher and other molecular RNA assays used on fixed tissue. Does the study design limit the reported performance to selected populations?252 Is the sample representative of molecularly heterogeneous tumors?253 To what extent does contamination of the sample by nonneoplastic cells contribute to variability in assay performance? What effect do preanalytic variables have on assay results?254
Though clinical use has been high for many of the RNA-based assays, often driven by direct marketing to urology practices, dedicated studies in active surveillance populations, which incorporate evaluation of needle biopsy information such as percent pattern 4 and the presence of cribriform architecture, are clearly needed to substantiate the utility of these expensive tests in this setting.
Molecular Assays Useful in the Context of High-Risk (High-Grade) Disease
In recent years, it has become clear that mutations in DNA repair pathway genes are enriched in high-risk prostate cancer. Alterations in the homologous recombination repair pathway, including mutations in the BRCA2, BRCA1, and ATM genes, are seen in nearly 20% of cases of advanced castration-resistant prostate cancer.255 Importantly, nearly half of these metastatic cases with homologous recombination defects have germline mutations, comprising close to 10% of men with castration-resistant prostate cancer.255,256 The dramatic enrichment of germline homologous recombination defect alterations in metastatic compared with primary prostate cancer supports numerous studies that suggest that these alterations are associated with the development of aggressive disease.257–260 Indeed, germline alterations in BRCA2 and ATM are significantly more common in lethal compared with indolent primary prostate cancer and are associated with grade reclassification in active surveillance cohorts.261,262 Moreover, in aggressive histologic subsets of primary prostate cancer (including ductal and intraductal prostate cancer and Grade Group 5 tumors) the prevalence of homologous recombination defect mutations may approach that seen in metastatic disease.206,263–266 Similarly, defects in genes involved in the mismatch repair (MMR) pathway are seen in up to 10% of castration-resistant prostate cancer cases, compared with less than 3% of primary tumors.267–269 MMR mutations may also be enriched among primary tumors with aggressive histology, including both ductal and Grade Group 5 tumors.237,266,270 Relative to homologous recombination defect alterations, fewer of the MMR mutations in prostate cancer are germline, though prostate cancer is enriched among patients with Lynch syndrome.271,272
These results have important implications for genetic counseling of patients and therapeutic decision-making. Currently, the NCCN guidelines recommend germline testing in high-risk patients with clinically localized prostate cancer, including all patients with Grade Group 4 or 5 tumors or patients in any risk group with intraductal carcinoma on biopsy.27 Given promising results in homologous recombination-deficient prostate cancer with poly(ADP-ribose) polymerase inhibitors, such as olaparib, and the recent Food and Drug Administration approval of pembrolizumab for all progressing tumors with MMR defects, consideration should be given to offering somatic genomic sequencing to men with metastatic prostate cancer.267,273
Utilization of Prostatic Molecular Tests in Clinical Practice Based on Survey Results
GUPS members were surveyed on the question “In which situations do you perform/request molecular tests (PTEN, commercial assays, etc.) in men with Gleason 6 (Grade Group 1) who are candidates for active surveillance?” The majority (137 of 211; 65%) did so only when requested by a clinician or by a combination of clinical request and other variables (11 of 211; 5%), such as young age, multiple positive cores, and high cancer percent involvement. For 5% (11 of 211) of respondents, tests were done without a clinical request depending on these other variables. Of respondents, 22% (46 of 211) never performed or requested molecular tests. When clinicians were asked the same question, 47.5% (171 of 360) responded that they never request molecular testing in these men. Of clinicians who do order the test, the most common reasons were in young men (171 of 360; 41%) and patient preference (128 of 360; 35.5%).
GUPS members were also surveyed on “In which situations do you perform/request molecular tests (PTEN, commercial assays, etc.) in men with Gleason score 3 + 4 = 7 (Grade Group 2) who are candidates for active surveillance?” The majority (149 of 211; 70%) did so only when requested by a clinician or by a combination of clinical request and other variables (6 of 211; 3%), such as older age, few positive cores, and low cancer percent involvement. For 5% (10 of 211) of respondents, tests were done without a clinical request depending on these other variables. Of respondents, 22% (46 of 211) never performed or requested molecular tests. When clinicians were asked the same question, 54% (196 of 361) responded they never request molecular testing in these men. Of clinicians who do order the test, the most common reason was patient preference (118 of 361; 32.7%).
WORKING GROUP 8: DIGITAL PATHOLOGY/ARTIFICIAL INTELLIGENCE AND NOVEL GRADING APPROACHES
A summary of recommendations on digital pathology/artificial intelligence and novel grading approaches is seen in Table 10.
Digital Pathology/Artificial Intelligence
With substantial advancements in technology and infrastructure surrounding computing power and storage of large image data sets, digital pathology, with the more recent advent of artificial intelligence tools, is on the cusp of making a significant impact on the practice of diagnostic pathology.274 Digital whole slide imaging captures an entire tissue sample on a slide, converting it into millions of pixels, which can be shared easily as a virtual slide or subjected to pattern-recognition learning algorithms using specialized digital pathology software.275–279 Artificial intelligence and machine learning techniques, such as deep neural networks, may be trained not only to recognize specific patterns on a whole slide image of a hematoxylin-eosin (H&E) slide but also to help in the interpretation of features in the tissue that are predictive and/or prognostic.274,280,281
A noteworthy body of work has occurred recently in applying novel imaging technologies, computational pathology, and artificial intelligence tools for the detection, grading, and prognostication of prostate cancer.275,277,278,282 While this GUPS position paper is primarily focused on grading prostate cancer, machine learning–based grading will benefit if artificial intelligence can assist in recognizing prostate cancer. This application of computational pathology with respect to prostate cancer is relatively new, and we will make some preliminary comments on prostate cancer recognition and provide an update on grading through artificial intelligence algorithms. Gorelick et al283 demonstrated the use of machine learning methods on 50 whole mount H&E slides from 15 patients with prostate cancer. The results were encouraging and demonstrated accuracies of 90% in prostate cancer detection and up to 85% in high versus low Gleason grade classification tasks. Somanchi et al282 also accurately identified cancer on digital H&E slides of prostate tissue. Esteban et al284 used whole slide images of prostate cancer and employed machine learning techniques to aid in prostate cancer detection. Another new technology that has recently been described to detect prostate cancer is the augmented reality microscope of Chen et al.285 The augmented reality microscope enables the pathologist to review a case under the microscope while artificial intelligence algorithms are run concurrently at the time of review to dynamically provide computer-assisted guidance to the pathologist. The ability to integrate real-time artificial intelligence algorithms using the live image feed of prostate cancer slides on the stage is potentially a significant improvement.
Wang et al286 described the use of 4 different machine learning algorithms, including artificial neural networks, for the prediction of prostate cancer with the goal of reducing unnecessary biopsies. An important aspect of this work was the inclusion of any available prebiopsy information, including PSA, as inputs in the model construction.
Campanella et al287 described the use of a weakly supervised deep learning method on 24 859 slides of prostate needle core biopsies. The advantage of this method is that it was possible to use slide-level diagnosis obtained from the electronic medical records to train the classification model in a weakly supervised manner. This type of approach saves the time and the effort needed by expert pathologists to manually annotate every single prostate core biopsy.
Using computer-assisted methods, it is now possible to objectively grade prostate cancer in histopathological slides in order to improve accuracy and reproducibility. This is an area of significant interest by several groups and several recent computational pathology and artificial intelligence studies have addressed Gleason grading of prostate cancer.280,288,289 Nir et al288 concluded that as newer artificial intelligence tools are developed for the grading of prostate cancer and other cancer types, it is critical to use annotation data by multiple experts for training and validation of these pattern-recognition algorithms.
A recent study by Lucas et al289 attempted to create an automatic Gleason pattern classification method that could potentially assist in Grade Group determination of prostate needle biopsies. A convolutional neural network, a deep learning method, was used on 96 prostate biopsies from 38 patients. The results were striking: when discriminating between benign and malignant (Gleason pattern ≥3) areas, the algorithm achieved an accuracy of 92%, with a sensitivity and specificity of 90% and 93%, respectively. In addition, when differentiating between Gleason patterns 4 or more and Gleason patterns 3 or less, the algorithm achieved an accuracy of 90%, with a sensitivity and specificity of 77% and 94%, respectively. When the algorithm's performance was compared with the grading done by an expert urologic pathologist, concordance was 65% (kappa = 0.70).
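The relationship among the accuracy, sensitivity, and specificity figures quoted for such classifiers can be illustrated with a minimal sketch. The function name and the counts in the usage note are hypothetical and are not taken from Lucas et al or any other cited study; the sketch only assumes the standard definitions based on true/false positive and negative counts.

```python
def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Return sensitivity, specificity, and accuracy from confusion-matrix counts.

    tp/fn: malignant (or high-grade) regions called correctly/incorrectly.
    tn/fp: benign (or low-grade) regions called correctly/incorrectly.
    """
    positives = tp + fn  # all truly positive regions
    negatives = tn + fp  # all truly negative regions
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative case")
    return {
        "sensitivity": tp / positives,
        "specificity": tn / negatives,
        "accuracy": (tp + tn) / (positives + negatives),
    }


# Hypothetical counts, for illustration only.
example = diagnostic_metrics(tp=8, fp=1, tn=9, fn=2)
```

Because accuracy is a prevalence-weighted average of sensitivity and specificity, an accuracy close to the specificity (as in the pattern 4 or more versus pattern 3 or less task) would be expected when the positive class is the less common one in the test set.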
More investigators are describing newer artificial intelligence methods for grading of prostate cancer. A recent study by Nagpal et al280 described a deep learning model to improve Gleason scoring of prostate cancer from prostatectomies. A massive data set of 112 million pathologist-annotated image patches from 1226 slides was used. The deep learning technique achieved an accuracy of 0.70, compared with a mean accuracy of 0.61 achieved by 29 general pathologists.
GUPS members were surveyed on “Do you or your group use any deep learning or artificial intelligence tools to aid with prostate cancer grading?” Of 211 respondents, 96% (202 of 211) said no and 4% (9 of 211) said yes. Many laboratories and practices are exploring the use of digital pathology and some are “slideless.” Commercially available tools for prostate cancer detection and grading are starting to come to market and it is expected that in the next 3 to 5 years there will be a number of robust clinical grade products to augment the diagnosis of prostate cancer and to help create more “automated” and “standardized” approaches to grading prostate cancer. Urologic pathologists may also benefit from future computational pathology methods and tools of pattern recognition where the computer will continue to learn from the expertise of urologic pathologists to build “feedback” loops for those pathologists with less urological pathology expertise. In summary, while there is active research and promise in this area, the findings are premature for recommendations of digital pathology/artificial intelligence in routine clinical application. Further scholarship in this field is required in terms of value, accuracy, and efficiency before best practices can be elucidated.
Novel Grading Approaches
Incorporating Reactive Stroma Grade
Host stromal reaction is highly variable in prostate cancer, with most cases having little to no reactive stroma. A grading system to quantitate reactive stroma was initially proposed by the Baylor College group in a large patient cohort using the trichrome stain.290 Tumors with 0% to 5% reactive stroma were assigned a Reactive Stroma Grade (RSG) of 0. Tumors with 5% to 15% were assigned RSG 1 and those with 15% to 50%, RSG 2. Tumors with greater than 50% reactive stroma were labeled RSG 3. Patients with RSG 3 had a significantly worse cancer-specific survival compared with those with RSG 1 or 2. The same authors subsequently converted their RSG system to a binary grading system that is applicable to routine H&E biopsies and can be integrated in proposed nomogram tools. RSG 3 is designated as stromogenic carcinoma, while RSG 0 to 2 tumors are categorized as nonstromogenic. Presence of stromogenic carcinoma in needle biopsies was an independent predictor of recurrence in Gleason score 3 + 4 = 7 and 4 + 3 = 7.291 In another study, patients with a higher percentage of stromogenic carcinoma on radical prostatectomy were shown to have decreased biochemical recurrence-free and cancer-specific survival independent of Gleason grade.292 The prognostic role of reactive stromal pattern in Gleason score 6 and Gleason score 7 has been validated in several independent cohorts.140,293–295 Consideration should be given to incorporation of stromogenic assessment in future modifications of the ISUP 2014 grading system.296
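The RSG thresholds and the binary stromogenic designation described above can be summarized in a short sketch. The function names are hypothetical, and the handling of the shared boundary values is an assumption: the text gives overlapping ranges (0% to 5%, 5% to 15%, 15% to 50%) and states only that RSG 3 requires more than 50%, so each upper bound is treated here as inclusive.

```python
def reactive_stroma_grade(pct_reactive_stroma: float) -> int:
    """Map percent reactive stroma (0-100) to Reactive Stroma Grade 0-3.

    Assumption: upper bounds are inclusive (5% -> RSG 0, 15% -> RSG 1,
    50% -> RSG 2); only the >50% rule for RSG 3 is explicit in the text.
    """
    if not 0 <= pct_reactive_stroma <= 100:
        raise ValueError("percent reactive stroma must be between 0 and 100")
    if pct_reactive_stroma <= 5:
        return 0
    if pct_reactive_stroma <= 15:
        return 1
    if pct_reactive_stroma <= 50:
        return 2
    return 3


def is_stromogenic(pct_reactive_stroma: float) -> bool:
    """Binary system: RSG 3 is stromogenic, RSG 0-2 is nonstromogenic."""
    return reactive_stroma_grade(pct_reactive_stroma) == 3
```

For instance, under these assumptions a tumor with 60% reactive stroma would be RSG 3 (stromogenic), whereas one with 40% would be RSG 2 (nonstromogenic).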
Incorporating Percent Gleason Pattern 4
Sauter et al28 analyzed radical prostatectomy specimens from 12 823 consecutive patients and 2971 matched preoperative biopsies for which clinical data with an annual follow-up between 2005 and 2014 were available. In prostatectomy specimens, there was a continuous increase of the risk of prostate-specific antigen recurrence with increasing percentage of Gleason pattern 4, with remarkably small differences in outcome at the clinically important thresholds (0% versus 5%; 40% versus 60% Gleason 4) that distinguish traditionally established prognostic groups. They proposed a quantitative Gleason scoring composed of the following 13 grades: 6 or less; 5% pattern 4; 6% to 10% pattern 4; 11% to 20% pattern 4; 21% to 30% pattern 4; 31% to 49% pattern 4; 3 + 4 = 7 with tertiary pattern 5; 50% to 60% pattern 4; 61% to 80% pattern 4; more than 80% pattern 4; 4 + 3 = 7 with tertiary pattern 5; 8; and 9 to 10. Quantitative grading may also reduce the clinical impact of interobserver variability because borderline findings, such as tumors with 5%, 40%, or 60% Gleason 4 fractions and very small Gleason pattern 5 fractions, are not a factor in the proposed system. This grading system preceded the recommendation to report percent pattern 4 on Grade Groups 2 and 3 and to some extent replicates the current practice in regard to factoring in Gleason pattern 4. Quantitative Gleason scoring would not be applicable to biopsies which lack a tertiary pattern 5.
Incorporating Tertiary Gleason 5 Patterns
Sauter et al297 built on their prior study on percent Gleason pattern 4 to factor in tertiary Gleason pattern 5. Prostatectomy specimens from 13 261 consecutive patients and 3295 matched preoperative biopsies were studied. Percentages of Gleason patterns 3, 4, and 5 were recorded for each cancer. The IQ-Gleason combines all Gleason pattern data of a prostate cancer in 1 continuous numeric value. It ranges from 0 to 117.5 and is calculated as follows: the percentage of unfavorable Gleason patterns (4 and 5), plus 10 score points if any Gleason pattern 5 was seen, plus another 7.5 score points if Gleason pattern 5 makes up more than 20% of the tumor. For example, the IQ-Gleason of a Gleason 3 + 4 = 7 cancer with 40% Gleason 4 is 40, and the IQ-Gleason of a Gleason 3 + 4 = 7/tertiary grade 5 cancer with 40% Gleason 4 and 5% Gleason 5 is 40 + 5 + 10 = 55. The latter represents the basis value (percentage of unfavorable Gleason patterns = 45) plus 10 score points because Gleason pattern 5 is present. The IQ-Gleason of a Gleason 4 + 5 = 9 cancer with 60% Gleason 4 and 40% Gleason 5 is 60 + 40 + 10 + 7.5 = 117.5. They found a continuous increase of the risk of prostate-specific antigen recurrence with increasing IQ-Gleason. This was also true for the subgroups with identical Cancer of the Prostate Risk Assessment Postsurgical scores or Grade Groups.
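The IQ-Gleason calculation described above can be written as a minimal sketch that reproduces the 3 worked examples in the text. The function and parameter names are hypothetical, and the input validation is an added assumption rather than part of the published method.

```python
def iq_gleason(pct_pattern4: float, pct_pattern5: float) -> float:
    """Compute the IQ-Gleason value from percent Gleason pattern 4 and 5.

    Basis value: percent of unfavorable patterns (4 plus 5).
    +10 points if any pattern 5 is present.
    +7.5 further points if pattern 5 exceeds 20%.
    """
    if pct_pattern4 < 0 or pct_pattern5 < 0:
        raise ValueError("pattern percentages cannot be negative")
    if pct_pattern4 + pct_pattern5 > 100:
        raise ValueError("pattern 4 and 5 percentages cannot exceed 100 in total")
    score = float(pct_pattern4 + pct_pattern5)  # basis value
    if pct_pattern5 > 0:
        score += 10.0   # any Gleason pattern 5 present
    if pct_pattern5 > 20:
        score += 7.5    # Gleason pattern 5 quantity above 20%
    return score


# Worked examples from the text: 40, 55, and 117.5.
examples = [iq_gleason(40, 0), iq_gleason(40, 5), iq_gleason(60, 40)]
```

The maximum of 117.5 arises when patterns 4 and 5 together account for 100% of the tumor and pattern 5 exceeds 20%, which is consistent with the stated range.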
The main criticisms of the IQ-Gleason are that it is too complicated for application in routine practice and that its scores are not intuitive for clinicians or patients. The methodology of accurately and reproducibly determining various patterns, and in particular pattern 5, also remains challenging. In addition, calculation of the Gleason score was based on total tumor in the radical prostatectomy and not the index tumor, as currently recommended. These deviations from the ISUP 2014 grading recommendations could result in inaccurate Grade Groups compared with IQ-Gleason. However, the biggest criticism is that the new system does not add to their older grading system (quantitative Gleason grade), which factors in the percent of Gleason pattern 4.
Incorporating Cribriform/Intraductal Carcinoma
Van Leenders et al298 most recently proposed a modification of the ISUP 2014 grading system based on factoring in IDC-P and cribriform carcinoma. All prostate cancer biopsies of 1031 men from the European Randomized Study of Screening for Prostate Cancer between 1993 and 2000 were reviewed for Grade Groups using current grading criteria. Invasive cribriform structures were defined as “small and expansive malignant epithelial proliferations with intercellular lumina in which the majority of tumor cells did not contact surrounding stroma, and which spanned at least half of the glandular lumen.” The authors designated their modified grades as “cGrade” to denote the factoring in of cribriform glands. Cribriform glands (as defined above) and IDC-P were considered together as “CR/IDC.” For prostate cancer patients with Grade Groups 2 to 5, the cGrade is the same as the Grade Group if CR/IDC is present. The cGrade is equal to the Grade Group minus 1 if CR/IDC is absent. For men with Grade Group 1, the cGrade is 1. In the rare case of Grade Group 1 prostate cancer with concomitant IDC-P, the assigned cGrade is 2. The cGrade had better discriminative value than Grade Groups for disease-specific survival, metastasis-free survival, and biochemical recurrence-free survival after radical prostatectomy and radiation therapy. In their study, 256 of 310 men with biopsy Grade Group 2 were reclassified as cGrade1. The number of men fulfilling the Prostate Cancer Research International Active Surveillance study criteria for active surveillance increased from 26.4% to 34.5% if cGrade1 was used instead of Grade Group 1. The authors concluded that reclassification of Grade Group 2 prostate cancer patients as cGrade1 might allow more patients to be considered for active surveillance.
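The cGrade assignment rules summarized above can be expressed as a brief sketch. The function and parameter names are hypothetical; a single flag is assumed to represent the combined “CR/IDC” category, and for Grade Group 1 that flag can only reflect IDC-P, since invasive cribriform glands would themselves be Gleason pattern 4.

```python
def c_grade(grade_group: int, cr_idc_present: bool) -> int:
    """Assign the cGrade from the Grade Group and CR/IDC status.

    Grade Group 1: cGrade 1, or cGrade 2 in the rare case with IDC-P.
    Grade Groups 2-5: same as Grade Group if CR/IDC is present,
    otherwise Grade Group minus 1.
    """
    if grade_group not in (1, 2, 3, 4, 5):
        raise ValueError("Grade Group must be an integer from 1 to 5")
    if grade_group == 1:
        # Assumption: a positive flag here denotes concomitant IDC-P.
        return 2 if cr_idc_present else 1
    return grade_group if cr_idc_present else grade_group - 1
```

Under these rules, a Grade Group 2 biopsy without CR/IDC becomes cGrade 1, which is the reclassification that expands the pool of men potentially eligible for active surveillance in the authors' model.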
However, there are some significant limitations to this study. Patients underwent only 6 core biopsies (sextant), which is very different from current clinical practice, such that their data may not be applicable to contemporary prostate cancer grading. Men with Grade Group 1 (Gleason score 3 + 3 = 6) sampled by sextant biopsy may have significant undersampling and undergrading.44 Consequently, their baseline comparison of this proposed grading system to the Gleason system is flawed, as the Gleason grades they use as comparators do not reflect contemporary sampling. Another significant limitation is that the study is small, considering the authors are proposing to change the existing grading system. The patients in this study were also treated with different treatment modalities and only 406 men underwent radical prostatectomy. In contrast, the initial study describing Grade Groups included more than 7500 men with biopsies treated by radical prostatectomy, followed by the validating multi-institutional study of more than 20 000 men.6,9 In their study, van Leenders et al298 also did not distinguish between IDC-P and cribriform carcinoma. The authors defined invasive cribriform structures as cribriform structures that spanned at least half of a glandular lumen, which is not one of the more commonly used definitions for large cribriform glands and whose meaning is unclear. By combining cribriform cancer and IDC-P into 1 group, the authors cannot evaluate whether concurrent IDC-P in cases with cribriform carcinoma drove the adverse prognosis. The authors claim the most prominent effect of their cribriform Grade model over the current Grade Group model is the expansion of the number of men potentially eligible for active surveillance. Cancers currently graded as Gleason score 3 + 4 = 7 (Grade Group 2) that lack cribriform glands and IDC-P would be graded as cGrade1 and would be considered for active surveillance.
The authors base this recommendation on their follow-up of men who were almost all treated for their cancer. For example, there are no studies following untreated men with Grade Group 2 cancer containing up to 50% poorly formed glands.
The authors correctly call their study a “proof-of-principle” study. Multi-institutional studies with a much larger number of patients, as well as long-term follow-up of men on active surveillance with Gleason score 3 + 4 = 7 without cribriform glands, are needed before such a major change in prostate cancer grading is adopted.
CONCLUSIONS
Herein, we present a position paper detailing the first prostate cancer grading recommendations from the GUPS, addressing key areas including recording the extent of pattern 4, grading of cribriform cancer, core-level versus global grading (especially in the setting of multiparametric MRI-targeted biopsies), reporting of IDC-P, and minor tertiary pattern 5. With an eye toward the future, we have also provided updates on molecular pathology and digital pathology/artificial intelligence and their intersection with prostate cancer grading. Our understanding of the molecular underpinnings of prostate cancer, both in the context of the grade of the disease and as a complement to grading, continues to evolve and demonstrate promise. Artificial intelligence has the potential to be both disruptive and, if incorporated appropriately, empowering to surgical pathologists, with greater impact on patient care. We hope our recommendations on contemporary prostate cancer grading will be the basis of more standardized reporting, while stimulating new avenues for research.
The authors thank all the GUPS members and clinicians who filled out the survey that helped formulate the recommendations of this GUPS position paper.