Context

Tumors of uncertain or unknown origin are estimated to constitute 3% to 5% of all metastatic cancer cases. Patients with these types of tumors show worse outcomes than patients in whom a primary tumor is identified. New molecular tests that identify molecular signatures of a tissue of origin have become available.

Objective

To review the literature on existing molecular approaches to the diagnosis of metastatic tumors of uncertain origin and discuss the current status and future developments in this area.

Data Sources

Published peer-reviewed literature, available information from medical organizations (National Comprehensive Cancer Network), and other publicly available information from tissue-of-origin test providers and/or manufacturers.

Conclusions

Molecular tests for tissue-of-origin determination in metastatic tumors are available and have the potential to significantly impact patient management. However, available validation data indicate that not all tests have shown adequate performance characteristics for clinical use. Pathologists and oncologists should carefully evaluate claims for accuracy and clinical utility for tissue-of-origin tests before using test results in patient management. The personalized medicine revolution includes the use of molecular tools for identification/confirmation of the site of origin for metastatic tumors, and in the future, this strategy might also be used to determine specific therapeutic approaches.

Unknown or uncertain primary cancer (UPC), also referred to as carcinoma of unknown primary (CUP), accounts for an estimated 3% to 5% of all metastatic cancers.1 Strictly speaking, the diagnosis of UPC requires a biopsy-proven metastatic malignancy and no identifiable primary tumor after a thorough clinical evaluation that includes physical examination and laboratory and imaging diagnostic tests.2 However, the frequency with which a pathologist performs a “UPC workup” has been estimated as double the number of cases of UPC.3 For cases designated as UPC/CUP after routine diagnostic evaluation, the source of these tumors is identified only 20% to 30% of the time ante mortem, even after extensive clinical, imaging, and immunohistochemical (IHC) workups.4 Unfortunately, the prognosis for patients in whom a primary site is not identified is poor, with median survival ranging from 6 to 10 months in clinical studies of unselected patients with UPC/CUP to 2 to 3 months in other studies.2 By comparison, patients for whom the primary source of cancer is identified have longer survival.5 Cases of UPC/CUP remain a source of frustration and psychological burden for both the patient and the medical team.

It is this type of clinical need, and the emergence of specific and more effective therapy regimens designed to combat metastatic disease from unique identifiable sites, that have resulted in the quest for better and more accurate identification of these tumors. To address this need, new molecular tests that identify molecular signatures of a tissue of origin have become available. In this article, we review these molecular approaches for determination of the tissue of origin in cases with an uncertain primary cancer (including but not limited to UPC/CUP).

Gene expression microarrays (GEMs) appeared in the mid-1990s as a powerful tool to analyze the expression of hundreds to thousands of genes simultaneously.6 Microarrays consist of a small solid support, such as a glass slide, silicon chip, or nylon membrane, that has thousands of DNA probes imprinted, spotted, or synthesized directly onto it. Probes for specific genes/messenger RNA (mRNA) transcripts are restricted to a specific location on the array. Gene expression microarrays work by exploiting the propensity of a given nucleic acid sequence to bind specifically (hybridize) to a complementary DNA sequence and the ability to detect the amount of hybridized nucleic acid by fluorescence detection.7 In an oversimplified description, gene expression analysis (or gene expression profiling) starts with RNA isolation from the tissue of interest (Figure). Once isolated, the RNA is reverse transcribed and converted into complementary DNA (cDNA). A biotin or fluorescent label is usually incorporated into the cDNA during reverse transcription. The cDNA is then hybridized to a microarray that contains probes for hundreds to thousands of genes/transcripts. Detection of a fluorescent signal in a specific location of the microarray indicates the presence of the transcript complementary to that probe, and the signal intensity indicates the abundance of the transcript in the sample. After the data are analyzed by bioinformatics software that performs background correction and normalization, a gene expression profile is obtained (Figure). Thus, in a single experiment, the expression level of hundreds to thousands of genes can be measured.7

Figure 1

Schematic view of a gene expression microarray assay for molecular profiling of human tumors.
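The background-correction and normalization step mentioned above can be illustrated with a minimal sketch. It assumes a single global background estimate and simple median centering on the log2 scale; the function name, the intensity values, and the choice of median centering are illustrative assumptions and do not describe the software used by any of the tests reviewed here.

```python
import math
from statistics import median


def normalize_array(raw_intensities, background, floor=1.0):
    """Background-correct, log2-transform, and median-center one array.

    Hypothetical helper for illustration only; commercial assays use their
    own (often proprietary) preprocessing.
    """
    # Subtract the background estimate; clamp at `floor` so log2 is defined.
    corrected = [max(value - background, floor) for value in raw_intensities]
    logged = [math.log2(value) for value in corrected]
    # Median centering puts different arrays on a comparable scale.
    center = median(logged)
    return [value - center for value in logged]


# Hypothetical probe intensities for three transcripts on one array.
profile = normalize_array([108.0, 40.0, 520.0], background=8.0)
```

Each value in the resulting profile is the log2 expression of one transcript relative to the median transcript on the same array, which is the kind of quantity a downstream classifier would consume.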


For more than a decade, GEMs have been used to study gene expression profiles of almost every diseased tissue, and especially malignant neoplasms. Some of the applications of GEMs have been for tumor identification or classification, risk assessment, prognosis, drug development, prediction of drug response, and tracking of disease progression/evolution. One of the first areas of microarray-based research was concerned with the attempt to classify tumors into known categories, such as leukemia and lymphoma, by gene expression profiles.8,9 These efforts were soon followed by experiments attempting to classify multiple tumor types based on their gene expression profiles.10,11 

In the first of these studies, Su and coworkers10 used GEMs, along with supervised machine learning algorithms, to generate a classifier based on a 110-mRNA transcript profile that was successful in classifying, with high confidence, 85% of samples from an independent test set that included 11 tumor types (n  =  75). Shortly thereafter, Ramaswamy and coworkers11 developed a support vector machine classifier for 14 tumor types using 16 063 transcripts. This classifier achieved an overall prediction accuracy of 78% in the independent test set evaluated (n  =  54). Interestingly, this algorithm was unable to classify tumors of poorly differentiated morphology. These first studies were mainly focused on demonstrating the feasibility of classifying tumors by their tissue of origin and exploring the classification algorithms best suited for this purpose. Even at this early stage, Ramaswamy and coauthors11 recognized the potential clinical use of this technology for “the diagnosis of clinically ambiguous tumors.”

In a study to identify different subtypes of lung carcinomas, Bhattacharjee et al12 identified previously unrecognized metastases of extrapulmonary origin and suggested a role for GEM analysis in confirming the origin of metastatic tumors in the lung. In 2002, Dennis and coworkers13 suggested that a reverse transcription-polymerase chain reaction (RT-PCR)–based clinical assay for identification of tissue of origin in adenocarcinomas of unknown origin could be developed on the basis of data from serial analysis of gene expression (SAGE), microarray studies, and other sources. They studied the expression of 11 candidate genes in different tissues by an RT-PCR assay to test this approach. However, only 7 of the 11 genes showed expression patterns similar to those shown by SAGE. This study highlighted the difficulty in translating research findings into a clinical assay. Furthermore, in 2003 Tan and colleagues14 showed that gene expression data were not reproducible when using different commercial microarray platforms and they raised concerns about the reliability of microarray-based gene expression assays. This study, and other works, highlighted the need for strict quality control in the development of microarray-based clinical tests.15,16 

In 2004, Bloom and others17 developed an artificial neural network–based gene expression classifier that was successful in identifying the tissue of origin in 85% of tumors profiled in multiple different platforms and laboratories (n  =  140), thus indicating that a robust clinical assay could be developed using microarrays. In 2005, Tothill et al18 developed a microarray-based gene expression classifier that achieved an internal accuracy of 89% in classifying 13 tissue classes. In this study, they showed the importance of having a diverse training sample set that included histologic subtypes from each tissue class. They then translated the classifier into an RT-PCR platform with 79 genes that showed high accuracy (>90%) in classifying 5 tissue classes and applied it to 13 CUP cases. They were able to obtain high-confidence predictions for site of origin in 11 cases, all of which were consistent with clinical-pathologic information. Also in 2005, a consortium of academic institutions and industry, in coordination with the US Food and Drug Administration (FDA), published seminal papers showing that the concordance between different microarray platforms had improved substantially thanks to advances in gene annotation and array design.19 It was also shown by these and other authors20–22 that high reproducibility in microarray results among laboratories could be achieved with the use of standardized protocols and array platforms.

The studies summarized above demonstrated the feasibility of using gene expression profiling to classify uncertain tumors according to their tissue of origin and set the stage for the development of commercially available clinical tests for this purpose (Table). Two main strategies were followed by the test developers: (1) the exploitation of information generated from GEM studies to develop RT-PCR–based assays and (2) the development of assays by using microarray platforms. Currently, only 1 such assay has been reviewed and cleared by the FDA: the Pathwork Tissue of Origin Test (Pathwork Diagnostics, Redwood City, California), which is a 1550-gene microarray-based test. Other tests available in the United States as laboratory-developed tests (LDTs) are Theros CancerTYPE ID (bioTheranostics, San Diego, California), a 92-gene real-time quantitative reverse transcription–polymerase chain reaction (qRT-PCR) assay; and the miRview mets test (Rosetta Genomics, Philadelphia, Pennsylvania), a 48–microRNA qRT-PCR assay. Another microarray-based test, the 1900-gene CupPrint (Agendia BV, Amsterdam, The Netherlands) is offered clinically in Europe but not in the United States. Lastly, the CUP assay, a 10-gene qPCR assay (Veridex, La Jolla, California) has been developed but is not yet clinically available. Although some of these tests are currently being offered (and used) for clinical purposes, the publicly available information on performance characteristics might not adequately support the intended clinical use.

Table 1

Molecular Tests for Tissue-of-Origin Determination


The translation of multigene expression assays into diagnostic or prognostic classification tests has well-identified requirements23 and pitfalls that should be addressed in validation studies.24 Classification algorithms usually perform best when used to classify samples used in the classifier development (overfitting to the training set); thus, validation with a large and independent sample set is paramount for establishing true performance of a given classifier. In this regard, Simon25 outlined key steps that should be taken into account when developing and validating therapeutically relevant genomic classifiers: (1) ensuring that the classifier addresses a specific and important clinical decision, (2) ensuring that the classifier shows sufficient accuracy in internal validation to assess further development, (3) translation to a platform for broad clinical application, (4) demonstration of reproducibility, and (5) independent validation of the prespecified classifier. It has also been recommended that validation studies show (1) adequate sample size for validation, to statistically demonstrate that classifications are accurate; (2) validation in all classes for which it was created, with enough specimens for each class; and (3) inclusion of indeterminate results in reported performance.24,25 Clearly, genomic classifiers for tissue-of-origin determination do address a clinically important question that impacts treatment decisions for patients with uncertain primary cancers; thus, the first requirement is fulfilled for all these tests. In the following paragraphs, we review the publicly available evidence for each of the commercially available tests and evaluate the above parameters (aside from clinical utility) in these molecular tests for the determination of tissue of origin.
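The third recommendation, inclusion of indeterminate results in reported performance, matters because excluding them can inflate apparent accuracy. The sketch below makes the difference concrete; the function name and the toy validation results are hypothetical and are not drawn from any of the studies reviewed here.

```python
def accuracy_summary(results):
    """Summarize results given as (predicted_tissue_or_None, reference_tissue).

    A prediction of None stands for an indeterminate (no-call) result.
    """
    total = len(results)
    determinate = [(pred, ref) for pred, ref in results if pred is not None]
    correct = sum(1 for pred, ref in determinate if pred == ref)
    return {
        # Indeterminate results stay in the denominator.
        "accuracy_all_samples": correct / total,
        # Indeterminate results are dropped, which flatters the classifier.
        "accuracy_determinate_only": (
            correct / len(determinate) if determinate else 0.0
        ),
        "indeterminate_rate": (total - len(determinate)) / total,
    }


# Hypothetical validation set: 7 correct, 1 incorrect, 2 indeterminate.
toy_results = [
    ("colon", "colon"), ("colon", "colon"), ("breast", "breast"),
    ("breast", "breast"), ("lung", "lung"), ("lung", "lung"),
    ("ovary", "ovary"), ("lung", "breast"),
    (None, "pancreas"), (None, "ovary"),
]
summary = accuracy_summary(toy_results)
```

In this toy set, the same classifier scores 70% when indeterminate results are counted and 87.5% when they are excluded, which is why the denominator should always be stated.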

Theros CancerTYPE ID

The first gene expression–based test for tumors of uncertain origin to be clinically available was the CancerTYPE ID test, developed by AviaraDx (now bioTheranostics). This test is currently offered by bioTheranostics as an LDT in its laboratory accredited through the Clinical Laboratory Improvement Amendments of 1988 (CLIA) (http://www.biotheranostics.com/products-services/hcp/ctid/; accessed June 29, 2009). This test is an RT-PCR 92-gene assay that is reported to classify 39 different tumor types and 64 subtypes. To develop this test, Ma and coworkers26 used GEM to generate profiles from 466 frozen tumors (75% primary and 25% metastatic). Using a genetic algorithm and K–nearest neighbor procedure, they developed several classifiers, from which the top performer (a 74-gene classifier) had an overall accuracy of 86% for predicting tissue of origin in an independent set of 112 formalin-fixed, paraffin-embedded (FFPE) samples. They then evaluated 126 genes from these classifiers for translation into an RT-PCR assay and selected 87 classification genes and 5 reference genes to construct a 92-gene assay for tumor classification. The 92-gene RT-PCR assay was evaluated with 119 FFPE tumor samples (including the original 112) representing 30 tumor classes and showed an overall accuracy of 82%.

On the basis of the published evidence, the classifier used in the Theros CancerTYPE ID test has shown sufficient accuracy (82%) in the internal validation, which has warranted further development of the assay. The assay was successfully translated to an RT-PCR platform that allows broad clinical application and, importantly, allows for the use of FFPE specimens. However, the reproducibility of this classifier has not been adequately shown, since there have been no subsequent publications. Furthermore, although the test was evaluated on an independent sample set, this set had only 119 tumors to represent 30 tumor classes; thus, representation from each tumor type ranged from 1 to 10 specimens, with 18 tissue types being represented by 3 samples or fewer. Based on these data, the reported sensitivity and specificity for a specific tumor type might only reflect the correct classification of 1 specimen (eg, breast). This sample size and class representation clearly fall short of the requirements outlined by Simon et al24,25 and of general recommended principles for clinical molecular pathology tests.27 One also needs to be aware that although the laboratory indicates that the test has the ability to classify 39 tumor types (http://www.biotheranostics.com/products-services/hcp/ctid/; accessed June 29, 2009), the published data show that the FFPE sample set contained only 30 tumor classes.26

Pathwork Tissue of Origin Test

The second test to be offered clinically in the United States was the Pathwork Tissue of Origin Test developed by Pathwork Diagnostics for both frozen and FFPE tissues. The frozen tissue version of this test was reviewed by the FDA and approved in July 2008 to be marketed as an in vitro diagnostic (IVD) device.28 However, no IVD kit has been released yet, and thus this test is currently only offered by Pathwork as an LDT in its CLIA-accredited laboratory (http://www.pathworkdx.com/TissueOfOriginTest/; accessed September 17, 2009). This test is a 1550-gene microarray-based assay that is reported to classify tumors into 15 known tissue types, representing 58 morphologic features. The test uses a proprietary microarray (PathChip) manufactured by Affymetrix (Santa Clara, California) and runs on Affymetrix's FDA-approved clinical instrumentation. To develop this test, Moraleda and coworkers29 used GEM to generate profiles from 5539 human tissue specimens to develop a 121-gene standardization algorithm that allowed comparison of gene expression data from different laboratories. They then developed a classification algorithm from gene expression profiles of 2039 tumors comprising 15 tissue types and 60 different morphologic features. The training set included both primary and metastatic tumors and well-differentiated to undifferentiated tumors. The test's proprietary algorithm reports a similarity score that ranges from 0 to 100 for each of the 15 known sites of origin. For the frozen-tissue version of the test, a similarity score of 30 or more is considered evidence that the specific tissue is present in the sample. A similarity score of less than 5 allows the site of origin to be ruled out and a similarity score below 30 but above 5 is classified as indeterminate. Dumur et al30 evaluated the analytic performance and reproducibility of this test at 4 laboratories by using archival frozen tissue from 60 poorly to undifferentiated primary and metastatic tumors. 
They showed the test had good reproducibility in the standardized expression values, similarity score, and final tissue calls between sites. Although this study did not have sufficient statistical power to evaluate clinical performance, average percentage agreement between the test result and the reference diagnosis was 86.7% (range, 84.9%–89.3%). In a subsequent article, Monzon et al31 reported a multicenter validation study with 547 samples (minimum representation for each tissue type of 25 specimens for each of the 15 tissues in the test). The study showed an overall accuracy of 87.8% (95% confidence interval [CI], 84.7%–90.4%) and overall specificity (negative percentage agreement with reference diagnosis) of 99.4% (95% CI, 98.3%–99.9%).
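The interpretation rule described above for the frozen-tissue version (a similarity score of 30 or more, less than 5, or in between) can be written as a short sketch. Only the two thresholds come from the text; the function name, the example scores, and the treatment of a score of exactly 5 as indeterminate are assumptions, and the proprietary algorithm that produces the scores is not represented.

```python
PRESENT_THRESHOLD = 30   # score of 30 or more: tissue considered present
RULE_OUT_THRESHOLD = 5   # score below 5: tissue can be ruled out


def interpret_similarity_scores(scores):
    """Map each tissue's similarity score (0 to 100) to a qualitative call."""
    calls = {}
    for tissue, score in scores.items():
        if score >= PRESENT_THRESHOLD:
            calls[tissue] = "present"
        elif score < RULE_OUT_THRESHOLD:
            calls[tissue] = "ruled out"
        else:
            # Assumption: a score of exactly 5 is treated as indeterminate.
            calls[tissue] = "indeterminate"
    return calls


# Hypothetical scores for 4 of the 15 tissue types.
example_calls = interpret_similarity_scores(
    {"colorectal": 72.4, "gastric": 12.0, "breast": 1.3, "ovarian": 5.0}
)
```

A report built this way makes the rule-out information explicit, which is relevant because the specificity figure cited above describes how reliable those negative calls were.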

The published evidence indicated that the classifier used in the Pathwork Tissue of Origin Test showed sufficient accuracy (86.7%) in the internal validation to warrant further development of the assay. The assay was not translated to an RT-PCR platform, but it was developed on a GEM (PathChip) that showed adequate reproducibility in an interlaboratory comparison study; thus, the platform appears suitable for clinical application.30 The test was then validated on an independent sample set of sufficient size (547 samples), class representation (at least 25 samples per tissue type), and inclusion of indeterminate results to meet the criteria for successful translation outlined above,24,25 as was also judged by the FDA.28 It is important to note that the performance characteristics that have been published to date pertain to the test performed on frozen tissues, which is the version approved by the FDA. Limited performance information about the FFPE version of the test has just been released in an abstract from the 2009 annual meeting of the American Society of Clinical Oncology,32 but a peer-reviewed publication of these data is not available yet.

miRview mets

Another test available in the United States is the miRview mets test from Rosetta Genomics (Philadelphia, Pennsylvania), which is also offered as an LDT. This test is an RT-PCR 48-microRNA (miRNA) assay that is reported to classify 25 different tumor types (http://www.mirviewdx.com; accessed June 29, 2009). MicroRNAs are short RNA molecules (21 to 25 nucleotides) that belong to a class of noncoding, regulatory RNAs that modulate gene expression and participate in developmental and oncogenic processes.33 The ability to classify tumors based on their tissue of origin, with miRNA profiles, was first shown by Lu and coworkers,34 who suggested that an miRNA-based assay could achieve better discrimination than one based on mRNA. Rosenfeld and coworkers35 used miRNA microarrays to generate profiles from 253 samples (most were FFPE) representing 22 different tumor classes and built 2 different classifiers from these data by using a decision tree and a K–nearest neighbor algorithm. They then applied these algorithms to 83 blinded samples that had also been profiled with miRNA microarrays. The authors considered a classification accurate when any 1 of the 2 algorithms correctly identified the tissue of origin (so called “union classifier”) and, with this definition, reported 86% accuracy. However, the algorithms agreed only in 66% of the cases (high-confidence classification), of which 89% were correct. This indicates that 11% of the time a high-confidence classification was incorrect and that concordance of both algorithms and tissue of origin (accuracy) was only 59%. Importantly, representation of tissue classes in the blinded test set ranged from 1 (testis) to 8 (head and neck). The test was translated to a qPCR platform and evaluated in 80 samples (65 new/15 original; 12 frozen/68 FFPE); however, for this independent sample set, the authors only reported performance for distinguishing between liver/nonliver and GI/non-GI samples. 
Again, representation of tissue classes in this independent test set ranged from 1 (ovary) to 12 (bladder).
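The 59% figure quoted above follows from multiplying the reported agreement rate by the accuracy among agreeing cases. A small sketch of that arithmetic, using only the percentages given in the text (the variable and function names are illustrative):

```python
def joint_rate(agreement_rate, accuracy_when_agreeing):
    """Fraction of all samples where both algorithms agree AND are correct."""
    return agreement_rate * accuracy_when_agreeing


agreement_rate = 0.66          # both algorithms gave the same answer
accuracy_when_agreeing = 0.89  # of those, fraction matching the true origin

both_agree_and_correct = joint_rate(agreement_rate, accuracy_when_agreeing)
high_confidence_error = 1 - accuracy_when_agreeing

print(f"Agree and correct: {both_agree_and_correct:.0%}")
print(f"High-confidence but wrong: {high_confidence_error:.0%}")
```

The gap between this joint rate and the 86% reported for the “union classifier” is the reason the definition of a correct call needs to be read carefully.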

One of the advantages of using an miRNA classifier is that this type of nucleic acid is readily retrievable from FFPE samples and is not significantly affected by fixation, paraffin embedding, and storage time. However, the performance of the miRview mets classifier is difficult to evaluate because of the 2 algorithms that showed agreement in only 66% of samples. Although Rosenfeld and coauthors35 focus their discussion on the decision-tree algorithm, the reported performance was based on the “union classifier” (accuracy for the decision tree alone was 72% in the internal validation). Importantly, the authors did not discuss how discrepancies between the classifiers would be handled in clinical practice. Thus, on the basis of published evidence, the classifier used in the miRview mets has shown lower accuracy when compared to other assays.35 It is not possible to determine if the assay was successfully translated to the qRT-PCR platform because of the lack of published data, including data regarding the reproducibility of this test. As with one of the other tests reviewed above, given the small independent sample set used, the reported sensitivity and specificity for a specific tumor type might only reflect the correct classification of 1 specimen (eg, testis or ovary).

CupPrint

The CupPrint test is a 1900-gene GEM test that was developed by Agendia BV (Amsterdam, The Netherlands) and is only available outside the United States in Agendia's CLIA-certified laboratory. This test is reported to classify 49 different tumor types from 11 systems (http://row.agendia.com/en/cupprint.html; accessed June 29, 2009). To develop this assay, Agendia licensed the database of gene expression profiles from tumors of unknown origin used for the development of the bioTheranostics assay.26 However, instead of only 92 genes, this test uses a GEM to measure gene expression for 495 mRNA transcripts for sample classification by using a 5–nearest neighbor algorithm plus additional transcripts used for data normalization.36 The algorithm determines the 5 most molecularly similar tumors in the CupPrint database and produces a score composed of weightings for the biopsy site, gender of patient, and algorithm analysis findings. The predicted site of origin is the tumor in the database with the most similarity to the sample. Horlings and collaborators36 described the evaluation of this assay with an independent set of 84 tumors (80% metastatic) from 9 tissue types, for which they obtained 83% accuracy in tissue-of-origin classification. Most misclassified samples were from lung (7 of 11) and pancreas (3 of 3). In this study, representation from different tissue types ranged from 5 (thyroid) to 16 (breast). In a subsequent study by Bridgewater et al,37 the CupPrint test was applied to FFPE tumor samples from 21 patients diagnosed with carcinoma of unknown origin. They studied reproducibility by analyzing 1 sample in triplicate and 2 independent metastases from 1 patient. There was consistency between identical samples analyzed independently.
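The general idea of a 5-nearest neighbor classifier can be sketched as follows. This is a generic majority-vote version using Euclidean distance on toy two-gene profiles; it is not the CupPrint algorithm, which additionally weights biopsy site and patient gender, and all names and values below are hypothetical.

```python
import math
from collections import Counter


def knn_predict(sample, reference, k=5):
    """Return (majority tissue label, fraction of neighbors supporting it).

    `reference` is a list of (expression_profile, tissue_label) pairs.
    """
    # Rank reference tumors by distance to the sample; closest first.
    ranked = sorted(reference, key=lambda item: math.dist(sample, item[0]))
    neighbors = ranked[:k]
    votes = Counter(label for _, label in neighbors)
    label, count = votes.most_common(1)[0]
    return label, count / len(neighbors)


# Hypothetical reference database with two-gene expression profiles.
reference_db = [
    ((1.0, 0.1), "breast"), ((0.9, 0.2), "breast"), ((1.1, 0.0), "breast"),
    ((0.1, 1.0), "lung"), ((0.2, 0.9), "lung"), ((0.0, 1.1), "lung"),
    ((0.5, 0.5), "colon"),
]
predicted, support = knn_predict((0.95, 0.1), reference_db, k=5)
```

Because the prediction depends entirely on which tumors are in the reference set, a tissue type with few reference profiles can be outvoted by better-represented classes, which is one reason class representation matters in both training and validation.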

Since this assay uses the same internal validation data as the Theros CancerTYPE ID test, it is clear that this classifier showed sufficient accuracy (82%) in the internal validation26 for further development of the assay. For the CupPrint assay, though, Agendia did not translate the assay to an RT-PCR platform but used a customized microarray instead, the same one used for its MammaPrint assay,38 for which they have obtained medical device registration in the European Community. The technical reproducibility of this microarray platform was shown in an article describing the validation of this array38; however, Bridgewater et al37 showed limited reproducibility studies with the CupPrint algorithm (on FFPE samples) with only 2 samples. In terms of independent validation, the CupPrint test was evaluated on an independent sample set of 84 primary and metastatic tumors representing 9 tissue types. Although minimum representation from each tumor type in this validation set is 5 specimens (eg, ovary), the sample size and class representation again fall short of the requirements outlined above.24,25,27 Agendia reports an 86% accuracy with an in silico validation of the CupPrint assay with the gene expression data from the study of Tothill et al.18 However, these validation data are not publicly available.

CUP Assay

The CUP assay, developed by Veridex (La Jolla, California), evaluates the expression of 10 tissue-type specific gene markers by using quantitative RT-PCR and is designed to detect tumors from 6 specific sites: lung, breast, colon, ovary, pancreas, and prostate. Talantov and collaborators39 reported the development of this assay by selecting 23 tissue-specific marker candidates from existing literature and databases and confirming their differential expression in 205 FFPE tumor samples. They then optimized the 10-gene RT-PCR assay, to be performed as a 1-step assay, and developed an algorithm based on RT-PCR results from a second group of 260 FFPE samples (training set) of which 239 gave usable results. Classification accuracy was 78.5% in the training set. Importantly, when tumors other than the 6 targeted tissue sites (n  =  32) were tested, they were misclassified 50% of the time. The CUP assay was then evaluated in an independent set of 37 samples with known origin and 11 UPC samples. In the 37 known samples, tissue representation ranged from 2 (prostate and ovary) to 9 (lung) and showed a classification accuracy of 75.6%. In a recent study, Varadhachary and collaborators40 used the CUP assay to study cases of 120 patients with CUP. The assay was performed successfully with 104 samples, with a specific tissue-of-origin assignment (within the 6 sites targeted by the assay) for 63 patients, with the remaining 41 samples classified as “other.” Internal consistency was confirmed by having the same assay results at all sites for 4 patients with multiple biopsies. Comparison with IHC results from these CUP cases indicated that the CUP assay suggested a specific tissue of origin more often than extensive IHC testing. They concluded that the CUP assay diagnosis was clinically useful in most cases. Importantly, cases that were molecularly identified as colon cancer showed better response to cancer-specific therapy when compared to standard CUP regimens.

The internal validation data for the CUP assay showed that this classifier had promising accuracy (78.5%). The development and optimization of this assay was done on an RT-PCR platform that allows for easy clinical implementation. The fact that the assay only measures 10 markers (plus 2 control genes) allowed the manufacturer to develop a single-tube methodology, which is desirable for a clinical test. However, the reduction in the number of genes evaluated limits the number of tissues that can be distinguished by this assay. In this regard, it is important to note that 50% of tumors outside the 6 targeted tissue types were incorrectly assigned to 1 of these tissue types. In addition, as with other assays discussed above, the independent validation cohort is quite small (n  =  37), with prostatic and ovarian tissues being represented by 2 specimens each. Thus, true clinical performance characteristics of this assay and tissue-specific performance are not known. Consistency of results for the CUP assay among samples from the same patient was reported by Varadhachary et al.40 However, more extensive reproducibility studies have not been published.

Unknown primary cancer is an important clinical problem that generates frustration among surgeons, oncologists, and pathologists, in addition to the uncertainty and stress it imposes on the patient. Incidence of UPC has been reported as 2% to 4% of all malignancies in 2 European countries,41,42 and in the United States it has been estimated that there will be 31 490 cases of cancer with unspecified primary site in 2009.43 As summarized above, several studies have demonstrated the feasibility of using focused or genome-wide gene expression profiling to classify tumors according to their tissue of origin. The ability to molecularly classify tumors, and the advances in clinical microarray and PCR technologies have resulted in the development of assays for tissue-of-origin identification intended for clinical application (Table). Oien3 has estimated that cases that undergo a UPC workup might be approximately double the number of UPC cases reported. Given the frequency of this problem, tissue-of-origin identification is a dilemma that all surgical pathologists are faced with, at one point or another. Although many cases that require a UPC workup can be adequately resolved by the use of IHC and consultations with expert pathologists, radiologists, and oncologists, in many cases there is still uncertainty even after a full evaluation. For example, in a recent case at our institution, a metastatic tumor with an IHC profile of breast cancer had to be molecularly confirmed after multiple imaging modalities failed to confirm the presence of a breast mass.

Molecular testing for tissue-of-origin identification is now a reality and has the potential to become an important tool in the pathologist's diagnostic armamentarium. This approach can be useful in supporting a suspected diagnosis, suggesting new diagnostic possibilities, and excluding diagnoses included in the tissue-of-origin differential. However, given the options available (see above), pathologists are faced with deciding which test is better to use for their patients. This is complicated by the number of choices available; the use of aggressive marketing by some laboratories; and the fact that the use of “black box” algorithms generates skepticism among pathologists like us who are used to having control and comprehensive understanding of all testing derived from “our” tissue samples. The most important question is to determine if the test being offered has been adequately validated for clinical use. As mentioned before, guidelines for translation of genomic classifiers and for validation of clinical molecular tests have been published and one can evaluate tissue-of-origin tests with these parameters.24,25,27 Most of the available tests have shown promising results in the internal validations and have been translated to RT-PCR or robust microarray platforms. However, after a test has gone through internal validation (or development), an external validation is the only way to determine whether the assay will perform adequately with samples other than those used for its development. An adequate external validation needs to have a statistically valid sample size, inclusion of enough specimens for each class to be identified, and inclusion of indeterminate results in overall performance.24,25 Peer-reviewed evidence shows that the quality of the external validation studies for the available tests is quite variable. 
Except for one study, validation studies have been restricted to a small number of specimens and do not have adequate representation for all tissue types being evaluated. In some cases, a claim of 100% sensitivity is made for a specific tissue type when the validation set contained only 1 or 2 specimens of that type. In addition, specificity (ie, how often the test correctly excludes a tissue type that is not the true origin) is reported only for Pathwork's Tissue of Origin Test (99.4%) and bioTheranostics' Theros CancerTYPE ID (>99%), although publicly available data exist only for Pathwork's test.31 
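The weakness of a 100% sensitivity claim based on 1 or 2 specimens, and the effect of leaving indeterminate results out of overall performance, can be illustrated with a short calculation. The following is a minimal sketch, not a method used by any of the cited studies: the Wilson score interval is one common choice of confidence interval for a proportion, the function names are our own, and all counts are hypothetical.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 gives ~95%)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n
    denom = 1 + z ** 2 / n
    center = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return max(0.0, center - half), min(1.0, center + half)


def agreement(correct: int, incorrect: int, indeterminate: int,
              count_indeterminate: bool = True) -> float:
    """Overall agreement; indeterminate calls stay in the denominator by default."""
    total = correct + incorrect + (indeterminate if count_indeterminate else 0)
    return correct / total


# Hypothetical per-tissue counts (correct calls, specimens tested);
# these are illustrative and not taken from any cited study.
for correct, tested in [(2, 2), (25, 25), (50, 50)]:
    low, high = wilson_interval(correct, tested)
    print(f"{correct}/{tested} correct: 95% CI {low:.1%} to {high:.1%}")

# Hypothetical 100-specimen validation set with 5 indeterminate results.
print(f"indeterminates counted:  {agreement(90, 5, 5):.1%}")
print(f"indeterminates excluded: {agreement(90, 5, 5, count_indeterminate=False):.1%}")
```

Under these assumptions, 2 correct calls out of 2 specimens is compatible with a true sensitivity as low as roughly one-third, whereas 50 of 50 narrows the lower bound to above 90%. Excluding the 5 indeterminate results raises the apparent agreement from 90% to about 95%, which is why their inclusion matters when tests are compared.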

How can a test that does not fulfill these validation criteria be offered clinically? Under current US regulations, laboratory directors have discretion in the evaluation and approval of validation studies in a CLIA-certified laboratory. This applies to all LDTs, which is the format in which these tests are currently being offered clinically. The development of diagnostic or prognostic tests with proprietary algorithms that are based on multianalyte platforms (such as microarrays) prompted the FDA to create a new category of tests, the in vitro diagnostic multivariate index assays (IVDMIAs), and to propose an approval process for such tests.44 An IVDMIA was defined as “a device that combines the values of multiple variables using an interpretation function to yield a single, patient-specific result (eg, a ‘classification,’ ‘score,’ ‘index,’ etc), that is intended for use in the diagnosis of disease . . . , and provides a result whose derivation is non-transparent and cannot be independently derived or verified by the end user.”44 Although all of the available tissue-of-origin assays fall within the IVDMIA definition, the FDA has not finalized this guidance and, thus, laboratories can still perform IVDMIAs without FDA approval (as LDTs). In fact, only 1 of the tests available in the United States, the Pathwork Tissue of Origin Test (frozen-tissue version), has undergone FDA review of its clinical validation data, becoming the second IVDMIA to receive clearance by this agency.

Apart from the issues of validation, many pathologists and oncologists still have reservations about the value and reliability of these tests. One common criticism of these assays is that most published studies have used samples of known origin to evaluate assay performance. Because these assays are intended for samples of uncertain origin, there are concerns about how well gene expression profiles from known tumors reflect the biology of CUP samples.45,46 By definition, the tissue type of a CUP sample is not known, and thus it cannot be used as a gold standard. For this reason, all studies reporting the development and evaluation of tests for tissue of origin need to establish performance with samples from known tissue types. This procedure is no different from what is commonly done to judge whether an antibody used for IHC is specific for a tumor or tissue type. Most, if not all, studies evaluating the use of antibodies for distinguishing between tumor types are done with panels of known and well-characterized tissues.47,48 

Certainly, performance with known tissue types does not necessarily translate to established performance with UPC/CUP cases; however, it does reflect the likelihood of a correct call when one tests a sample of uncertain origin. A few recent studies have evaluated some of these gene expression assays on CUP specimens. Horlings et al36 applied the CupPrint assay to tumor samples from patients with CUP who were subdivided into 3 groups: (1) patients presenting with CUP, with tissue of origin identified by IHC (n = 16), and for whom the test showed concordance with IHC diagnosis in 93.8% of cases; (2) patients with CUP, with differential diagnosis of 2 or 3 sites after IHC (n = 12), and for whom the test predicted a single origin, concordant with clinicopathologic information in 8 of 12 cases; and (3) patients with UPC, with no suspected primary site, and for whom the test predicted a single origin, concordant with the clinical suspicion in 6 of 10 cases. In another study using the same test, Bridgewater et al37 reported clinically compatible results in 18 of 21 tumors. The authors speculated that if the predicted site-of-origin information had been available, clinical management might have changed in 12 of the 21 cases. In a study in which we evaluated the Pathwork Tissue of Origin Test for 21 CUP cases (F.A.M., F. Medeiros, M. Lyons-Weiler, W. D. Henner, unpublished data, 2009), the test identified a probable single primary site in 76% of cases, with all identified sites compatible with the available clinical information. 
In the largest study of this kind to date, Varadhachary and coworkers evaluated the 10-gene CUP assay in a cohort of 120 patients and identified a putative tissue of origin in 61% of patients.40 Thus, the rate at which gene expression tests determine a possible primary tumor in CUP cases ranges from 60% to 85% in these studies; it therefore appears that such tests would be able to identify the tissue of origin in approximately 7 of 10 patients with true CUP, potentially decreasing the number of unidentified primary tumors by 70%. Because the number of patients in all of these studies is small, more studies are needed to evaluate both performance with CUP samples and the clinical impact of this type of molecular test. This general uniformity in call rates among the various expression tests does not necessarily imply uniformity in call accuracy or in assay range. By definition, the gold standard for tumor calls in CUP cases is unknowable. Thus, as mentioned above, accuracy in tissue identification can only be evaluated with the use of a large and diverse set of known tumor specimens in assay development and clinical validation. This type of evaluation is needed before any projections can be made about a test's clinical diagnostic value, let alone its potential impact on therapy choices and outcome, in the setting of UPC/CUP.
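The arithmetic behind the "approximately 7 of 10" figure can be retraced from the proportions cited above. The following is a rough sketch under stated assumptions: where only a percentage was reported, the count is our own back-calculation (76% of 21 taken as 16; 61% of 120 taken as 73), and the Horlings et al group whose origin was already identified by IHC is left out. The studies use different end points (clinical concordance in some, a single-site call in others), so combining them is a crude summary rather than a meta-analysis.

```python
# (label, cases with a tissue-of-origin result counted in the text, cases tested)
studies = [
    ("Horlings et al, group 2", 8, 12),
    ("Horlings et al, group 3", 6, 10),
    ("Bridgewater et al", 18, 21),
    ("Pathwork, unpublished data", 16, 21),  # assumed count: 76% of 21
    ("Varadhachary et al", 73, 120),         # assumed count: 61% of 120
]

# Per-study proportions, as quoted in the text (60% to about 85%).
rates = [called / tested for _, called, tested in studies]

# Unweighted mean treats every study equally regardless of size.
unweighted_mean = sum(rates) / len(rates)

# Pooled rate weights each study by its number of cases.
pooled = sum(c for _, c, _ in studies) / sum(t for _, _, t in studies)

for (label, called, tested), rate in zip(studies, rates):
    print(f"{label}: {called}/{tested} = {rate:.1%}")
print(f"unweighted mean of study rates: {unweighted_mean:.1%}")
print(f"pooled rate across all cases:   {pooled:.1%}")
```

With these assumed counts the unweighted mean is close to 70%, whereas the pooled rate is nearer 66% because the largest cohort reported one of the lowest rates. Either figure rests on fewer than 200 cases in total.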

An important consideration for any new clinical test is whether patients will derive benefit from it, in terms of both clinical outcomes and cost. As mentioned above, it has been well established that patients for whom a tissue of origin is identified with current diagnostic approaches fare better than patients whose primary tumor remains unknown.5 The available tests in the United States cost between US $3350 and US $3750, whereas the average cost of a full diagnostic workup for a patient with CUP was estimated at $18 000 in 1995,49 and is likely higher today. If molecular profiling for tissue-of-origin determination replaces some of the currently performed diagnostics, it has the potential to reduce costs. In addition, tissue-of-origin identification may lead to improvement in therapeutic selection for up to 80% of patients currently classified as having UPC/CUP, thus decreasing the use of costly, ineffective therapies and improving patient outcomes. Of course, whether molecular profiling of tumors from patients with UPC/CUP leads to reduced costs and improved outcomes needs to be prospectively explored. Importantly, in a recent paper by Varadhachary et al,50 patients treated on the basis of a molecular profile of colorectal origin (based on Veridex's CUP assay) showed better outcomes than patients treated with conventional CUP management. The study was small, but its results were promising, suggesting that patient management could be improved by molecular tissue-of-origin results.
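The cost argument reduces to a break-even question: what share of the conventional workup must a molecular test replace before it pays for itself? The sketch below uses only the figures quoted above. It compares 2009 test prices with a 1995 workup estimate without adjusting for inflation, and the 30% displacement scenario is purely hypothetical; the function names are our own.

```python
TEST_PRICE_RANGE_USD = (3350, 3750)  # price range quoted in the text
WORKUP_COST_USD_1995 = 18_000        # 1995 estimate quoted in the text, not inflation adjusted


def break_even_fraction(test_price: float, workup_cost: float) -> float:
    """Share of the workup cost a test must displace to offset its own price."""
    return test_price / workup_cost


def net_saving(test_price: float, workup_cost: float, displaced_fraction: float) -> float:
    """Saving per patient if the test replaces `displaced_fraction` of the workup."""
    return workup_cost * displaced_fraction - test_price


for price in TEST_PRICE_RANGE_USD:
    fraction = break_even_fraction(price, WORKUP_COST_USD_1995)
    saving = net_saving(price, WORKUP_COST_USD_1995, displaced_fraction=0.30)  # hypothetical
    print(f"test at ${price}: break-even at {fraction:.1%} of workup; "
          f"net saving at 30% displacement ${saving:,.0f}")
```

On these numbers a test would need to replace roughly one-fifth of the conventional workup to be cost neutral. Whether that happens in practice is the question that prospective studies would have to answer.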

We are now clearly in a new era in the diagnosis of tumors of uncertain origin, which is part of the personalized medicine revolution. Molecular tests for tissue-of-origin determination are available and have the potential to significantly impact patient management. As pathologists, it is our responsibility to understand and evaluate the available molecular tools that can assist us in establishing a primary tumor diagnosis. In addition, emerging data are rapidly challenging the assumed homogeneity of tumors classified by morphologic appearance. Genomic tests that identify breast cancer molecular subgroups with different prognoses are already available.51 Thus, it is quite possible that in the future we will use molecular tools not only to identify the site of origin but also to identify molecular subsets of poorly differentiated and undifferentiated tumors that require specific therapeutic approaches.

Dr Monzon was principal investigator of the analytic and clinical validation study for the Pathwork Tissue of Origin Test and received funding through a sponsored research agreement with Pathwork Diagnostics, LLC. Dr Monzon has also received honoraria and travel funds from Pathwork Diagnostics for speaking engagements related to the Pathwork Tissue of Origin Test and honoraria for consultation in other topics unrelated to this test. Neither Dr Monzon nor Dr Koen holds equity, employment, or leadership position at any of the companies that manufacture/offer tests for tissue-of-origin determination. Dr Koen has no relevant financial interest in the products or companies described in this article.

Note: As of August 1, 2009, Pathwork Diagnostics is no longer offering the frozen version of its test; only the FFPE version is now available. In addition, the miRview mets test from Rosetta Genomics is now also marketed as ProOnc TumorSourceDx by Prometheus Therapeutics and Diagnostics.

1. Pentheroudakis, G., V. Golfinopoulos, and N. Pavlidis. Switching benchmarks in cancer of unknown primary: from autopsy to microarray. Eur J Cancer 2007. 43(14):2026-2036.
2. Abbruzzese, J. L., M. C. Abbruzzese, R. Lenzi, K. R. Hess, and M. N. Raber. Analysis of a diagnostic strategy for patients with suspected tumors of unknown origin. J Clin Oncol 1995. 13(8):2094-2103.
3. Oien, K. A. Pathologic evaluation of unknown primary cancer. Semin Oncol 2009. 36(1):8-37.
4. Pavlidis, N. and K. Fizazi. Cancer of unknown primary (CUP). Crit Rev Oncol Hematol 2005. 54(3):243-250.
5. Bishop, J. F., E. Tracey, P. Glass, P. Jelfs, and D. Roder. Prognosis of sub-types of cancer of unknown primary (CUP) compared to metastatic cancer. J Clin Oncol 2007. 25(18S):21010.
6. Schena, M., D. Shalon, R. W. Davis, and P. O. Brown. Quantitative monitoring of gene expression patterns with a complementary DNA microarray. Science 1995. 270(5235):467-470.
7. Tefferi, A., M. E. Bolander, S. M. Ansell, E. D. Wieben, and T. C. Spelsberg. Primer on medical genomics, part III: microarray experiments and data analysis. Mayo Clin Proc 2002. 77(9):927-940.
8. Tamayo, P., D. Slonim, J. Mesirov, et al. Interpreting patterns of gene expression with self-organizing maps: methods and application to hematopoietic differentiation. Proc Natl Acad Sci U S A 1999. 96(6):2907-2912.
9. Golub, T. R., D. K. Slonim, P. Tamayo, et al. Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science 1999. 286(5439):531-537.
10. Su, A. I., J. B. Welsh, L. M. Sapinoso, et al. Molecular classification of human carcinomas by use of gene expression signatures. Cancer Res 2001. 61(20):7388-7393.
11. Ramaswamy, S., P. Tamayo, R. Rifkin, et al. Multiclass cancer diagnosis using tumor gene expression signatures. Proc Natl Acad Sci U S A 2001. 98(26):15149-15154.
12. Bhattacharjee, A., W. G. Richards, J. Staunton, et al. Classification of human lung carcinomas by mRNA expression profiling reveals distinct adenocarcinoma subclasses. Proc Natl Acad Sci U S A 2001. 98(24):13790-13795.
13. Dennis, J. L., J. K. Vass, E. C. Wit, W. N. Keith, and K. A. Oien. Identification from public data of molecular markers of adenocarcinoma characteristic of the site of origin. Cancer Res 2002. 62(21):5999-6005.
14. Tan, P. K., T. J. Downey, E. L. Spitznagel Jr, et al. Evaluation of gene expression measurements from commercial microarray platforms. Nucleic Acids Res 2003. 31(19):5676-5684.
15. Johnson, K. and S. Lin. QA/QC as a pressing need for microarray analysis: meeting report from CAMDA'02. BioTechniques 2003. suppl:62-63.
16. Ma, C., M. Lyons-Weiler, W. Liang, et al. In vitro transcription amplification and labeling methods contribute to the variability of gene expression profiling with DNA microarrays. J Mol Diagn 2006. 8(2):183-192.
17. Bloom, G., I. V. Yang, D. Boulware, et al. Multi-platform, multi-site, microarray-based human tumor classification. Am J Pathol 2004. 164(1):9-16.
18. Tothill, R. W., A. Kowalczyk, D. Rischin, et al. An expression-based site of origin diagnostic method designed for clinical application to cancer of unknown origin. Cancer Res 2005. 65(10):4031-4040.
19. Larkin, J. E., B. C. Frank, H. Gavras, R. Sultana, and J. Quackenbush. Independence and reproducibility across microarray platforms. Nat Meth 2005. 2(5):337-344.
20. Bammler, T., R. P. Beyer, S. Bhattacharya, et al. Standardizing global gene expression analysis between laboratories and across platforms. Nat Meth 2005. 2(5):351-356.
21. Dobbin, K. K., D. G. Beer, M. Meyerson, et al. Interlaboratory comparability study of cancer gene expression analysis using oligonucleotide microarrays. Clin Cancer Res 2005. 11(2, pt 1):565-572.
22. Irizarry, R. A., D. Warren, F. Spencer, et al. Multiple-laboratory comparison of microarray platforms. Nat Meth 2005. 2(5):345-350.
23. Petricoin III, E. F., J. L. Hackett, L. J. Lesko, et al. Medical applications of microarray technologies: a regulatory science perspective. Nat Genet 2002. 32(suppl):474-479.
24. Simon, R., M. D. Radmacher, K. Dobbin, and L. M. McShane. Pitfalls in the use of DNA microarray data for diagnostic and prognostic classification. J Natl Cancer Inst 2003. 95(1):14-18.
25. Simon, R. Roadmap for developing and validating therapeutically relevant genomic classifiers. J Clin Oncol 2005. 23(29):7332-7341.
26. Ma, X. J., R. Patel, X. Wang, et al. Molecular classification of human cancers using a 92-gene real-time quantitative polymerase chain reaction assay. Arch Pathol Lab Med 2006. 130(4):465-473.
27. Jennings, L., V. M. Van Deerlin, and M. L. Gulley. Recommended principles and practices for validating clinical molecular pathology tests. Arch Pathol Lab Med 2009. 133(5):743-755.
28. FDA clears test that helps identify type of cancer in tumor sample [news release]. Silver Spring, MD: US Food and Drug Administration.
29. Moraleda, J., N. Grove, Q. Tran, et al. Gene expression data analytics with interlaboratory validation for identifying anatomical sites of origin of metastatic carcinomas. J Clin Oncol 2004. 22(14S):9625.
30. Dumur, C. I., M. Lyons-Weiler, C. Sciulli, et al. Interlaboratory performance of a microarray-based gene expression test to determine tissue of origin in poorly differentiated and undifferentiated cancers. J Mol Diagn 2008. 10(1):67-77.
31. Monzon, F. A., M. Lyons-Weiler, L. J. Buturovic, et al. Multicenter validation of a 1550-gene expression profile for identification of tumor tissue of origin. J Clin Oncol 2009. 27(15):2503-2508.
32. Pillai, R., R. Deeter, C. T. Rigl, M. Halks-Miller, W. D. Henner, and L. Buturovic. Validation of a microarray-based gene expression test for tumors with uncertain origins using formalin-fixed paraffin-embedded (FFPE) specimens [abstract]. J Clin Oncol 2009. 27(15S):e22015.
33. Waldman, S. A. and A. Terzic. A study of microRNAs in silico and in vivo: diagnostic and therapeutic applications in cancer. FEBS J 2009. 276(8):2157-2164.
34. Lu, J., G. Getz, E. A. Miska, et al. MicroRNA expression profiles classify human cancers. Nature 2005. 435(7043):834-838.
35. Rosenfeld, N., R. Aharonov, E. Meiri, et al. MicroRNAs accurately identify cancer tissue origin. Nat Biotechnol 2008. 26(4):462-469.
36. Horlings, H. M., R. K. van Laar, J. M. Kerst, et al. Gene expression profiling to identify the histogenetic origin of metastatic adenocarcinomas of unknown primary. J Clin Oncol 2008. 26(27):4435-4441.
37. Bridgewater, J., R. van Laar, A. Floore, and T. V. L. Van. Gene expression profiling may improve diagnosis in patients with carcinoma of unknown primary. Br J Cancer 2008. 98(8):1425-1430.
38. Glas, A., A. Floore, L. Delahaye, et al. Converting a breast cancer microarray signature into a high-throughput diagnostic test. BMC Genomics 2006. 7(1):278. doi:10.1186/1471-2164-7-278.
39. Talantov, D., J. Baden, T. Jatkoe, et al. A quantitative reverse transcriptase-polymerase chain reaction assay to identify metastatic carcinoma tissue of origin. J Mol Diagn 2006. 8(3):320-329.
40. Varadhachary, G. R., D. Talantov, M. N. Raber, et al. Molecular profiling of carcinoma of unknown primary and correlation with clinical evaluation. J Clin Oncol 2008. 26(27):4442-4448.
41. Levi, F., V. C. Te, G. Erler, L. Randimbison, and C. La Vecchia. Epidemiology of unknown primary tumours. Eur J Cancer 2002. 38(13):1810-1812.
42. van de Wouw, A. J., M. L. Janssen-Heijnen, J. W. Coebergh, and H. F. Hillen. Epidemiology of unknown primary tumours; incidence and population-based survival of 1285 patients in Southeast Netherlands, 1984-1992. Eur J Cancer 2002. 38(3):409-413.
43. Jemal, A., R. Siegel, E. Ward, Y. Hao, J. Xu, and M. J. Thun. Cancer statistics, 2009. CA Cancer J Clin 2009. 59(4):225-249.
44. Draft Guidance for Industry, Clinical Laboratories, and FDA Staff: In Vitro Diagnostic Multivariate Index Assays. Rockville, MD: US Food and Drug Administration, Center for Devices and Radiological Health. July 26, 2007.
45. Pentheroudakis, G., E. Briasoulis, and N. Pavlidis. Cancer of unknown primary site: missing primary or missing biology? Oncologist 2007. 12(4):418-425.
46. Pentheroudakis, G., F. A. Greco, and N. Pavlidis. Molecular assignment of tissue of origin in cancer of unknown primary may not predict response to therapy or outcome: a systematic literature review. Cancer Treat Rev 2009. 35(3):221-227.
47. Dennis, J. L., T. R. Hvidsten, E. C. Wit, et al. Markers of adenocarcinoma characteristic of the site of origin: development of a diagnostic algorithm. Clin Cancer Res 2005. 11(10):3766-3772.
48. Bahrami, A., L. D. Truong, and J. Y. Ro. Undifferentiated tumor: true identity by immunohistochemistry. Arch Pathol Lab Med 2008. 132(3):326-348.
49. Schapira, D. V. and A. R. Jarrett. The need to consider survival, outcome, and expense when evaluating and treating patients with unknown primary carcinoma. Arch Intern Med 1995. 155(19):2050-2054.
50. Varadhachary, G. R., M. N. Raber, A. Matamoros, and J. L. Abbruzzese. Carcinoma of unknown primary with a colon-cancer profile-changing paradigm and emerging definitions. Lancet Oncol 2008. 9(6):596-599.
51. Ross, J. S. Multigene classifiers, prognostic factors, and predictors of breast cancer clinical outcome. Adv Anat Pathol 2009. 16(4):204-215.

Author notes

From the Department of Pathology, The Methodist Hospital, Houston, Texas (Drs Monzon and Koen); the Department of Pathology, The Methodist Hospital Research Institute, Houston, Texas (Dr Monzon); and the Department of Pathology, Weill Cornell Medical College, New York, New York (Dr Monzon).
