Appropriate patient management requires precise and meaningful tumor classification. Breast cancer classification continues to evolve from traditional morphologic evaluation to more sophisticated systems with the integration of new knowledge from research being translated into practice. Breast cancer is heterogeneous at the molecular level, with diversified patterns of gene expression, which is presumably responsible for the difference in tumor behavior and prognosis. Since the beginning of this century, new molecular technology has been gradually applied to breast cancer research on issues pertinent to prognosis (prognostic signature) and therapeutic prediction (predictive signature), and much progress has been made.
To summarize the current state and the prospective future of molecular classification of breast cancer.
Sources include recent medical literature on molecular classification of breast cancer.
Identification of intrinsic tumor subtypes has set a foundation for refining the molecular classification of breast cancer. Studies have explored the genetic features within the intrinsic cancer subtypes and have identified novel molecular targets, leading to clinical assays that predict a patient's prognosis and provide specific guidance for therapeutic decisions. With the development and implementation of these molecular tools, we have remarkably advanced our knowledge and enhanced our power to provide optimal management to patients. However, challenges still exist. Besides accurate prediction of prognosis, we are still in urgent need of more molecular predictors of tumor response to therapeutic regimens. Further exploration along this path will be critical for improving a patient's prognosis.
Breast cancer is a heterogeneous group of malignant tumors with diversified morphology, biologic behavior, clinical course, and prognosis, and accurate tumor classification is critical for a patient's care. Theoretically, a valuable classification system has to fulfill at least 1 of 2 roles: to predict tumor behavior and a patient's prognosis, and to predict tumor response to available therapeutic regimens to help clinicians select the optimal treatment plan. The current World Health Organization (WHO) tumor classification1 is based primarily on morphologic evaluation and divides invasive breast carcinoma into 2 major categories: no special type (NST) and special types. Although the WHO classification is quite sophisticated, it has limitations. First, the NST category is for tumors that cannot be further classified. NST comprises approximately 75% of breast cancer cases, and about 60% of NSTs are moderately differentiated carcinomas (Nottingham grade 2). This is clearly a heterogeneous group, although ductal differentiation is assumed. The remaining 25% of breast cancers are assigned to 19 subgroups of special types with distinctive morphologic features and better-defined prognoses. Histomorphologic classification alone is therefore not sufficient to satisfy the need for individualized patient care, and this opens the door for molecular classification to be explored.
INTRINSIC MOLECULAR CLASSIFICATION
The pioneering molecular classification was pursued by Perou et al2 at the beginning of this century. Using complementary DNA microarrays representing 8102 human genes, they first characterized a set of 65 surgical specimens of breast tumors from 42 individuals and found that the tumors could be classified into subtypes distinguished by pervasive differences in their gene expression profiles. With further studies and refinement,3,4 the authors proposed a classification scheme based on gene expression profiling (GEP) that divided breast cancer into 4 intrinsic molecular subtypes: luminal A, luminal B, v-erb-b2 (ERBB2)/human epidermal growth factor receptor 2 (HER2) gene-overexpressing (HER2+), and basal-like. The luminal carcinomas characteristically express estrogen receptor (ER), with variable cell proliferation. HER2 overexpression is the hallmark of ERBB2-overexpressing tumors, which also lack ER and progesterone receptor (PR) expression. Basal-like carcinoma fails to express ER, PR, or HER2 (triple-negative breast carcinoma; TNBC), instead expressing basal cell markers, such as cytokeratin (CK) 5/6 and/or epidermal growth factor receptor (EGFR). These subtypes demonstrated distinct histologic patterns, clinical features, and prognoses. The development seemed quite exciting. However, adoption of the GEP test by general pathology laboratories turned out to be difficult because of its technical complexity and cost. Therefore, alternative ways were sought to simulate the GEP results. Cheang et al5–7 identified a novel immunohistochemical (IHC) panel, including 6 IHC markers, and found that it could recapitulate the biologic subgroups of breast cancer derived from full GEP.
Schnitt8 later summarized the IHC diagnostic criteria of the intrinsic classification as follows: (1) luminal A: ER+ and/or PR+, HER2−, and low Ki-67 (<14%); (2) luminal B: ER+ and/or PR+, HER2+ or HER2−, and high Ki-67 (>14%); (3) HER2+: ER−, PR−, and HER2+; and (4) basal-like (BLBC): ER−, PR−, HER2− (triple negative), plus CK 5/6+ and/or EGFR+. The criteria were adopted by the 2013 European St Gallen Consensus with minor modifications, increasing the Ki-67 cutoff to 20% or more and decreasing the PR cutoff to 20% or less for better separation.9
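The surrogate criteria listed above lend themselves to a simple decision rule. The following Python sketch is illustrative only and is not a validated clinical tool. The names `IhcProfile`, `surrogate_subtype`, and `KI67_CUTOFF` are hypothetical. Three points are assumptions not settled by the criteria as quoted: a Ki-67 of exactly 14% is treated as high, hormone receptor-positive HER2+ tumors are assigned to luminal B regardless of Ki-67, and triple-negative tumors without CK 5/6 or EGFR expression are reported separately from basal-like.

```python
from dataclasses import dataclass

# Ki-67 cutoff (percent) from the criteria as summarized in the text.
# ASSUMPTION: a value of exactly 14% is treated as "high".
KI67_CUTOFF = 14.0


@dataclass
class IhcProfile:
    """Hypothetical container for the IHC results of one tumor."""
    er_positive: bool
    pr_positive: bool
    her2_positive: bool
    ki67_percent: float
    ck56_positive: bool = False
    egfr_positive: bool = False


def surrogate_subtype(p: IhcProfile) -> str:
    """Assign an IHC surrogate intrinsic subtype (illustrative sketch)."""
    if p.er_positive or p.pr_positive:
        if not p.her2_positive and p.ki67_percent < KI67_CUTOFF:
            return "luminal A"
        # ASSUMPTION: any remaining hormone receptor-positive tumor
        # (HER2+ or high Ki-67) is called luminal B.
        return "luminal B"
    if p.her2_positive:
        return "HER2+"
    # ER-, PR-, HER2-: triple negative; basal-like needs a basal marker.
    if p.ck56_positive or p.egfr_positive:
        return "basal-like"
    return "triple-negative, non-basal"


if __name__ == "__main__":
    example = IhcProfile(er_positive=True, pr_positive=True,
                         her2_positive=False, ki67_percent=8.0)
    print(surrogate_subtype(example))
```

Keeping the cutoff in a single named constant makes it straightforward to substitute the 20% Ki-67 threshold of the St Gallen modification.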
The intrinsic molecular subtyping has set a landmark in the development of breast cancer classification. However, it is not perfect. First, it classifies breast cancer into only 4 subtypes, which obviously oversimplifies the molecular complexity of the underlying tumors. In fact, each subtype is still heterogeneous, with variable prognosis and diversified treatment responses. Second, it can be approximated in large part by IHC. Therefore, GEP-based subtyping has not gained popularity in routine practice. However, it does form a foundation for the further exploration of prognostic and predictive molecular assays. Following the identification of the intrinsic molecular subtypes, such assays have been developed; they are currently most mature for the luminal cancers, where they guide a patient's clinical management.
LUMINAL BREAST CANCERS
The luminal type comprises approximately 60% of breast cancers and is hallmarked by the expression of ER, which is more uniform in type A than in type B. Clinically, these tumors tend to present at an early stage. Multiple assays have been developed specifically for this subgroup to guide the optimal clinical management of individual patients. So far this is the group of tumors in which molecular tools have been most successful in predicting tumor biologic behavior and therapeutic response, and clear management guidance has been provided for individual patients. Following is a brief description of the assays that have gained US Food and Drug Administration (FDA) approval.
21-Gene Recurrence Score
This is a reverse transcription–polymerase chain reaction (RT-PCR) assay on 16 genes associated with proliferation, tumor invasion, hormone receptor expression, and HER2 gene expression, plus 5 reference genes.10,11 It is designed to answer 2 questions. First, how likely is a patient with ER+ HER2− early-stage breast cancer and 3 or fewer lymph node metastases to experience recurrence after endocrine therapy? Second, will the patient benefit from adjuvant chemotherapy, taking side effects into consideration? The assay provides a numeric recurrence score, along with predictions of the risk of distant recurrence at 9 years and the group average absolute chemotherapy benefit. The scores divide patients into low-, intermediate-, or high-risk groups based on the patient's age and provide specific guidance for further individualized management. The result is reported according to lymph node status (negative versus positive) and includes quantitative single-gene expression scores for ER, PR, and HER2. This assay has been validated by the National Surgical Adjuvant Breast and Bowel Project B-20 and Trial Assigning Individualized Options for Treatment (TAILORx) clinical trials, is commercially known as Oncotype DX, and has been widely used clinically.12,13
Prosigna Gene Signature
Prosigna Gene Signature, also called prediction analysis of microarray of 50 genes (PAM50), is designed for postmenopausal women within 10 years of diagnosis of early-stage ER+ breast cancer with up to 3 positive lymph nodes. It was designed to predict the risk of distant metastasis after 5 years of standard postoperative hormonal therapy and the benefit of hormonal therapy beyond 5 years.14,15 It generates the Prosigna risk of recurrence score, which divides patients into low-, intermediate-, or high-risk groups based on nodal status. It was recently reported to be independently prognostic for long-term breast cancer survival, irrespective of menopausal status.16 Compared with Oncotype DX, it assigns fewer patients to the intermediate group, therefore offering better risk stratification.17
MammaPrint
MammaPrint is a microarray-based prognostic signature test that analyzes the activity of 70 genes in early-stage breast cancer.18 It is designed to predict the risk of tumor recurrence or metastasis within 5 to 10 years postoperatively in patients with stage I or II, ER+ or ER−, HER2− breast cancer whose tumors are smaller than 5 cm, with up to 3 positive lymph nodes. It divides patients into high-risk and low-risk groups based on the recurrence score. The high-risk patients may benefit from adjuvant chemotherapy or other treatment modalities. This assay has been validated in multiple studies and by the Microarray in Node-negative Disease may Avoid Chemotherapy trial (MINDACT).19–22
EndoPredict
The EndoPredict test analyzes the activity of 12 genes in breast cancer cells for patients with early-stage, ER+ HER2− breast cancer with up to 3 positive lymph nodes. It is designed to predict the risk of distant metastasis within 10 years after diagnosis.23,24 Test results are given as an EPclin Risk Score, which is categorized as either low risk or high risk. Combined with other clinical information, such as the tumor grade, percentage of ER expression, and patient's age, a more informed decision about chemotherapy can be made.25,26
In addition to the above FDA-approved assays, there are a few other available genetic tests for luminal-type cancer, such as the Breast Cancer Index.27,28 It originally analyzed the activity of 7 cancer-related genes to predict the risk of cancer recurrence 5 to 10 years after diagnosis (late recurrence) and the likely benefit of extending hormonal therapy beyond the initial 5 years. The assay currently covers the activity of 11 genes and serves the same purposes in early-stage, hormone receptor–positive (HR+) breast cancer.29 Although not yet FDA approved, it has been recommended in the recent National Comprehensive Cancer Network guidelines.30–32
Although all of the above assays provide estimates of the risk of distant recurrence for patients receiving endocrine therapy, their estimates can be discordant, and the different molecular influences on these assays are underexplored. A recent study on molecular drivers found that the 21-gene recurrence score is determined strongly by ER-related features and only weakly by proliferation markers, whereas PAM50 and EndoPredict are determined mainly by proliferative features. These relationships may explain the differences in prognostic performance of these tests.33 Features of the assays are summarized in the Table.
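For readers who prefer a structured view, the main features of the 4 assays described in detail above can be collected in a small lookup structure. The Python sketch below is illustrative only and is not clinical guidance. The names `Assay`, `ASSAYS`, and `candidate_assays` are hypothetical, and the filter deliberately ignores criteria that the text also mentions (menopausal status, stage, and tumor size). A HER2 requirement is recorded as `None` where the text does not state one.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Assay:
    """Features of one assay, transcribed from the descriptions in the text."""
    name: str
    genes: int                               # total genes analyzed
    requires_er_positive: bool
    requires_her2_negative: Optional[bool]   # None = not stated in the text
    max_positive_nodes: int
    risk_groups: Tuple[str, ...]


ASSAYS = [
    # 16 cancer-related genes plus 5 reference genes
    Assay("21-Gene Recurrence Score (Oncotype DX)", 21, True, True, 3,
          ("low", "intermediate", "high")),
    Assay("Prosigna (PAM50)", 50, True, None, 3,
          ("low", "intermediate", "high")),
    Assay("MammaPrint", 70, False, True, 3, ("low", "high")),
    Assay("EndoPredict", 12, True, True, 3, ("low", "high")),
]


def candidate_assays(er_positive: bool, her2_positive: bool,
                     positive_nodes: int) -> List[str]:
    """Return assays whose stated ER, HER2, and nodal criteria are met.

    ASSUMPTION: an assay with no stated HER2 requirement is not excluded
    on HER2 status; other eligibility criteria are not checked.
    """
    names = []
    for assay in ASSAYS:
        if assay.requires_er_positive and not er_positive:
            continue
        if assay.requires_her2_negative and her2_positive:
            continue
        if positive_nodes > assay.max_positive_nodes:
            continue
        names.append(assay.name)
    return names


if __name__ == "__main__":
    # ER-negative, HER2-negative, node-negative tumor
    print(candidate_assays(er_positive=False, her2_positive=False,
                           positive_nodes=0))
```

Under these simplifications, an ER−, HER2−, node-negative tumor matches only MammaPrint, which reflects the statement above that MammaPrint accepts ER+ or ER− disease.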
Another gene expression signature worthy of mention is the Genomic Grade Index (Gene Expression Grade Index),34 which was developed by analyzing the activity of 97 tumor genes in ER+ breast cancers by RT-PCR. The index divides tumors into low-risk and high-risk subsets, which correspond to tumor histologic grades 1 and 3, respectively. Notably, there is no group in this classification that corresponds to the intermediate histologic grade, which is the most prevalent histologic grade of breast cancer and a subgroup with heterogeneous biologic behavior and treatment response. Using the gene signature, Metzger-Filho et al35 successfully reclassified 54 histologic grade 2 tumors into grade 1 (54%) and grade 3 (46%). This differentiation affected the treatment decision for the grade 3 group and overall increased the use of chemotherapy. Although it seems a promising assay for practical application, it has not been validated for clinical recommendation and has not yet gained FDA approval.
HER2+ BREAST CANCER
This subtype of tumors is the purest group in terms of molecular changes.36 It is diagnosed by immunohistochemistry revealing HER2 overexpression, or by fluorescence in situ hybridization (FISH) demonstrating HER2 gene amplification. Most tumors present with high histologic grade and behave aggressively. However, the tumors are sensitive to conventional chemotherapy and respond well to the humanized monoclonal antibody trastuzumab (Herceptin), with pathologic complete response (pCR) in 23% to 40% of the cases and an improvement in event-free survival from 43% to 58% in some studies.37–40 On the other hand, a small portion of tumors are not sensitive to Herceptin therapy. Molecular studies have identified markers that predict Herceptin resistance, such as PTEN loss with PI3K pathway activation,41 HER2Delta16 expression,42 p95HER2 with loss of the Herceptin binding site,43 IGF-1R overexpression,44 and MUC4 overexpression.45 Further study to validate the biomarkers predicting Herceptin resistance would facilitate the optimal management of patients with this small but biologically unique cancer subtype.
BLBC/TNBC
TNBC is identified by surrogate immunohistochemistry with negative reactions to ER, PR, and HER2. It may or may not express CK 5/6 and/or EGFR at the protein level. Therefore, it is similar to but not identical to BLBC, which is identified by GEP. Research indicates that the 2 entities have approximately 75% concordance.46 However, most TNBCs and BLBCs are morphologically similar and biologically aggressive, although groups of low-grade BLBC/TNBC have been recognized with much better prognosis.47 Because of the close association, these 2 entities are lumped together for discussion.
TNBCs/BLBCs are responsible for 30% of breast cancer deaths, although they account for only 10% to 20% of breast cancers. Most high-grade TNBCs/BLBCs are of NST histology, with a small proportion being metaplastic carcinomas.48 Because of the lack of targetable biomarkers, there are no optimal therapeutic strategies yet. Even with a relative sensitivity to neoadjuvant chemotherapy (NACT), with pCR in up to 39% of tumors, patients with residual carcinoma have significantly worse prognoses (the triple-negative paradox).49–53 This is clearly still a heterogeneous group with diversified molecular constituents, and further classification is required for individualized management to improve a patient's prognosis.
Efforts have been made to further classify TNBC in the past years. Using a 2188-gene set, Lehmann et al54 analyzed 21 publicly available data sets with 587 cases of primary TNBC and found that, based on their GEP, tumors could be further classified into 6 subtypes (Vanderbilt subtypes): basal-like 1 (BL1), basal-like 2 (BL2), immunomodulatory (IM), mesenchymal (MES), mesenchymal stemlike (MSL), and luminal androgen receptor (LAR). The findings were challenged by Prat et al,55 who found that the gene expression profiles in the IM and MSL groups were derived from tumor stromal cells rather than cancer cells. In an additional study, Lehmann et al56 also concluded that tumor-infiltrating lymphocytes (TILs) contributed significantly to the GEP of the IM subtype, so that correlation with this signature should be considered a descriptor of the immune state of the tumor rather than an independent subtype. Meanwhile, the MSL subtype comprised tumors with a high abundance of tumor-associated mesenchymal tissue. Therefore, Lehmann et al refined their classification into 4 stable subtypes (BL1, BL2, MES, and LAR) and stated that each had distinctive clinicopathologic characteristics. For example, BL1 is the largest group, comprising about 35% of the cases. This group demonstrates the best response to NACT, with the highest number of cases achieving pCR, and therefore has the best overall survival and disease recurrence–free survival. By comparison, BL2 has the lowest pCR rate after NACT, with a worse disease recurrence–free survival. The MES subtype is characterized by a lack of lymphocytic infiltrates and a low rate of lymph node metastasis but a high rate of lung metastasis. Lobular carcinoma falls exclusively into LAR, which, not surprisingly, has a lower histologic grade, with frequent lymph node and bone metastases. Detectable hormone receptor (ER and androgen receptor) transcripts are also a feature of this subtype. This system could subclassify 98% of tumors.
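The refinement from 6 to 4 subtypes described above amounts to removing the 2 subtypes whose signatures were attributed to non-tumor cells. A minimal Python sketch of that bookkeeping follows; the variable names are hypothetical, and the subtype labels and reasons are taken from the text.

```python
# The 6 subtypes originally proposed by Lehmann et al, as listed in the text.
ORIGINAL_SUBTYPES = ["BL1", "BL2", "IM", "MES", "MSL", "LAR"]

# Subtypes whose expression signatures were attributed to non-tumor cells.
STROMAL_DERIVED = {
    "IM": "tumor-infiltrating lymphocytes",
    "MSL": "tumor-associated mesenchymal tissue",
}

# The refined classification keeps only the tumor cell-derived subtypes.
REFINED_SUBTYPES = [s for s in ORIGINAL_SUBTYPES if s not in STROMAL_DERIVED]

if __name__ == "__main__":
    print(REFINED_SUBTYPES)
```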
It is interesting that although the IM subtype was eliminated from the Lehmann system because of its remarkable TILs, which confound the gene profiling of tumor cells, TILs have become an important target for immunotherapy of breast cancer.
A few years later, Burstein et al,57 using RNA profiling and DNA copy number segmentation techniques, analyzed 198 previously uncharacterized TNBCs and validated the results in an external set of 220 TNBC cases. They classified TNBC into 4 stable subtypes (Baylor subtypes): LAR, MES, basal-like immune suppressed (BLIS), and basal-like immune activated (BLIA). These subtypes were found to be biologically diverse, to activate distinct molecular pathways, to have unique DNA copy number variants, and to exhibit distinct clinical outcomes. In this system, BLIS has the worst prognosis, whereas BLIA has the best prognosis, seemingly reflecting the role of TILs. Within our current understanding, TILs consist of a remarkably complex population of lymphoid cells in tumor stroma, some of which play a role in immune defense against tumor cells, whereas other components of the immune microenvironment, such as programmed death ligand-1 (PD-L1), help tumor cells escape the immune response. Some of the gene amplifications could be detected by FISH studies, which could identify new targets for specific antagonists, inhibitors, or specific drugs. Although the subtype terms selected by the Vanderbilt and Baylor groups are similar, the profiles of the corresponding subtypes are not well aligned.58 Among the subtypes, LAR has the highest overlap between the 2 systems.
Integrated analysis of both genomic DNA copy number alterations and transcriptomic gene expression profiling of molecular driver genes has been used for breast cancer classification. Curtis et al59 analyzed approximately 2000 breast cancers from the Molecular Taxonomy of Breast Cancer International Consortium data set, including 997 in the discovery set and 995 in the validation set. They identified 10 integrated clusters (IntClust), each with distinct molecular features and prognoses. The results were further validated in more than 7500 samples using an external data set.60 The analysis confirms the genetic purity of HER2+ cancer, which falls into only IntClust 5, whereas the luminal types (A and B) have the most diversified genetic alterations. Interestingly, most of the BLBCs fall into IntClust 10 with poor prognosis, whereas a small portion are found in IntClust 4 with good prognosis, probably corresponding to the specific group of low-grade BLBCs.34,61 This system appears promising in characterizing a tumor's behavior and prognosis, but its practical value awaits further assessment.
Next-generation sequencing (NGS) is a high-throughput technique that looks for all the genomic alterations of a tumor simultaneously, including complex genomic abnormalities such as single-nucleotide variants, short insertions/deletions, copy number variations, and fusions. This method has identified subsets of gene expression and somatic mutations that reflect tumor biologic behavior accurately and may lead to further effective therapeutic targets, better prognosis, and improved outcomes. However, its value in breast cancer classification is not well defined. This is particularly relevant to TNBC, the most aggressive type of breast cancer.62 An early study by the Cancer Genome Atlas Network revealed that BLBC has a significantly high frequency of tumor protein p53 (TP53) mutations. In comparison, luminal A cancer has a high mutation frequency of PIK3CA but a low mutation frequency of TP53,63 suggesting an association with different prognoses and management strategies. Dillon et al64 studied 20 cases of primary TNBC in which 358 gene alterations were detected, including single-nucleotide variants, small insertions, deletions, and copy number variants. Among these, MYC amplification was the most common alteration (75%), and it was particularly frequent in BLBC (89%). Weisman et al65 found in 39 TNBC cases that the most common mutation was TP53 (74%), but MYC amplification was found in only 26% of the cases. Interestingly, within their cohort, 4 cases of apocrine TNBC were found to harbor a much lower frequency of TP53 mutation (25%), and MYC amplification was not identified. High-frequency mutations in PIK3CA and other PI3K signaling pathway–related genes (75%) were noted in these cases, indicating a distinctive subset with genetic alterations similar to those of luminal subtype breast cancer.
Early-onset breast cancer in young patients tends to have a more aggressive clinical course and worse outcome. In a small NGS study of 32 breast cancer patients younger than 40 years, Andrikopoulou et al66 found that the most common somatic mutations were in PIK3CA and TP53, similar to those in older patients. However, CHEK2 germ line mutations were more frequent in this subgroup. The authors highlighted the need for more genetic testing in young patients.
For people with metastatic breast cancer, the goal of molecular testing is to find their tumor's specific mutations and then to target those mutations with an already approved therapy, or a therapy being studied in clinical trials, in order to increase survival and quality of life. NGS has shown that recurrent/metastatic breast cancer lesions may have additional genetic changes compared with the primary tumor. These additional changes may be related to tumor progression and/or drug resistance. Using NGS combined with immunohistochemistry for hormone receptors, HER2, and PD-L1, Hempel et al67 studied a group of 41 patients with metastatic breast cancers, among which were 16 TNBCs. They found that 27 patients had more than 1 genetic alteration, and the most commonly altered genes were PIK3CA and ERBB2. Overall, 68% of the alterations helped guide further clinical treatment decisions.
The challenge for NGS is that the large numbers of genetic alterations identified might not be clinically actionable. Identified mutations do not guarantee therapeutic response, and targeted inhibitors have to be proven clinically effective. A recent study compared the therapeutic effects of genomically directed therapy versus treatment of physician's choice in 193 patients with residual TNBC after NACT and found that genomically directed therapy was not superior.68 This clearly indicates that more studies are required to further explore the role of NGS testing in the management of breast cancer patients.
SUMMARY
Molecular classification has provided a significant amount of additional information leading to individualized therapies for breast cancer patients. It also identifies novel cancer driver genes and potentially targetable biomarkers. Our capability to predict prognosis for patients with various types of breast cancer has been remarkably enhanced, although our ability to predict the efficacy of NACT and conventional chemotherapy is still limited in TNBC/BLBC patients. Challenges include the lengthy process of transferring a potential assay from bench to bedside, the lack of any guarantee that a gene alteration predicts therapeutic response, cost-effectiveness, and quality control issues. Meanwhile, we still eagerly await more specific predictive assays to meet the need for personalized therapy. Clinical implementation of these new findings is still in its infancy, but there is great potential for better prognostic stratification and concomitant improved therapeutic interventions. Finally, molecular classification should serve as an addition to our routine histopathologic evaluation of breast cancers, not a replacement.
The author would like to acknowledge Thomas Holdbrook, MD, pathologist, for his critical proofreading and improvement of the manuscript.
The author has no relevant financial interest in the products or companies described in this article.
Presented in part at the Eighth Princeton Integrated Pathology Symposium; April 11, 2021; virtual.