Despite widespread use of formalin-fixed, paraffin-embedded (FFPE) tissue in clinical and research settings, potential effects of variable tissue processing remain largely unknown.
To elucidate molecular effects associated with clinically relevant preanalytical variability, the National Cancer Institute initiated the Biospecimen Preanalytical Variables (BPV) program.
The BPV program, a well-controlled series of systematic, blind and randomized studies, investigated whether a delay to fixation (DTF) or time in fixative (TIF) affects the quantity and quality of DNA and RNA isolated from FFPE colon, kidney, and ovarian tumors in comparison to case-matched snap-frozen controls.
DNA and RNA yields were comparable among FFPE biospecimens subjected to different DTF and TIF time points. DNA and RNA quality metrics revealed assay- and time point–specific effects of DTF and TIF. A quantitative reverse transcription–polymerase chain reaction (qRT-PCR) assay was superior when assessing RNA quality, consistently detecting differences between FFPE and snap-frozen biospecimens and among DTF and TIF time points. RNA Integrity Number and DV200 (representing the percentage of RNA fragments longer than 200 nucleotides) displayed more limited sensitivity. Differences in DNA quality (Q-ratio) between FFPE and snap-frozen biospecimens and among DTF and TIF time points were detected with a qPCR-based assay.
DNA and RNA quality may be adversely affected in some tumor types by a 12-hour DTF or a TIF of 72 hours. Results presented here as well as those of additional BPV molecular analyses underway will aid in the identification of acceptable delays and optimal fixation times, and quality assays that are suitable predictors of an FFPE biospecimen's fit-for-purpose.
Formalin fixation and paraffin embedding, first developed for histologic diagnosis more than a century ago, is used for tissue preservation by medical institutions around the world. Despite analytical challenges, molecular analysis of formalin-fixed, paraffin-embedded (FFPE) biospecimens is widespread in both basic research and clinical settings, including current clinical trials that use high-throughput sequencing of FFPE tumor biospecimens (NCT02788084, NCT02090530, NCT02843711, NCT01952275).1
Despite the nearly universal use of FFPE biospecimens in research and clinical settings, it remains unclear how variable and/or suboptimal biospecimen handling practices affect molecular analysis. From previous studies reporting effects of delayed fixation (the time between surgical excision and preservation) and the duration of formalin fixation on immunohistochemistry,2,3 the American Society of Clinical Oncology and the College of American Pathologists released guidelines for immunohistochemical testing in breast cancer biospecimens4,5 while acknowledging the need for additional supporting data. Potential effects of delayed fixation and formalin fixation duration on DNA and RNA endpoints are less clear than those for immunohistochemistry, due in part to a lack of evidence from well-controlled and adequately powered studies. While reports indicate that a delay to fixation (DTF) and time in fixative (TIF) affect DNA and RNA differently, the direction and magnitude of effects are dependent upon the gene/transcript targeted and the analytical platform used. Published studies report, for example, that a DTF of 1 hour adversely affected fluorescence in situ hybridization (FISH) signal,6,7 while effects on next-generation sequencing (NGS), which included declines in the percentage of unique reads and the number of detectable single nucleotide variants (SNVs), were not pronounced until 48 hours and did not result in false-positive SNV calls.8 Most reports agree that overfixation (≥72 hours) may adversely affect both DNA and RNA analysis9; however, the magnitude and scope of effects (global versus specific) remain unclear, as do potential effects associated with underfixation. Further complicating matters, nucleic acids from FFPE tissues are often degraded, precluding the use of traditional methods such as RNA Integrity Number (RIN) to determine biospecimen quality.
Taken together, reported effects on nucleic acid analysis provide cause for concern, but offer little guidance on suitable fixation and processing procedures.
The field of biospecimen science, which studies how preanalytical variability during biospecimen collection, handling, and storage affects downstream research data, has grown in the last decade due in part to increased concerns over data reproducibility. Two government-funded programs, one in the United States (National Cancer Institute's Biospecimen Research Network, NCI's BRN)10 and one in the European Union (Standardization and improvement of generic preanalytical tools and procedures for in vitro diagnostics, SPIDIA),11 have addressed these issues through systematic research studies of a multidisciplinary nature. NCI designed and initiated the Biospecimen Preanalytical Variables (BPV) program in 2010 to better understand how preanalytical variability affects molecular analysis. The BPV program investigates the effects of DTF and TIF in FFPE biospecimens, as well as preanalytical factors related to freezing and storing tissue and blood biospecimens. This ongoing study performed prospective collection of a well-annotated cohort of matched FFPE and frozen tumor and normal adjacent biospecimens from 4 different tissue types (kidney, ovary, lung, and colon) along with matched blood at 4 different medical institutions. Here we describe the program's experimental design, the infrastructure developed to support the program and harmonize collection and processing practices across multiple institutions, and initial results on how nucleic acid quantification and quality assessment are affected by DTF and TIF. Analysis of the same biospecimen cohort by multiple molecular platforms was performed to assess analyte-specific effects and to determine which quality metrics were suitable predictors of platform performance, following a fit-for-purpose model. This initial article represents the first in a series aimed to elucidate the relationship between DNA, RNA, and protein quality metrics and downstream analytical performance.
MATERIALS AND METHODS
Tasked with identifying major preanalytical factors, tumor types, relevant assays and technologies, and patient and tumor eligibility criteria, a Scientific Steering Committee (SSC) composed of interdisciplinary clinicians and scientists was established in the early part of the BPV program. The SSC recommended that, based on available evidence6,12–26 and perceived research and clinical need for evidence, the initial studies should focus on DTF and TIF. Following the SSC's recommendation, the BPV program chose to collect renal cell carcinoma (kidney), ovarian/fallopian tube carcinoma (ovary), lung adenocarcinoma and squamous cell carcinoma (lung), and colorectal adenocarcinoma (colon) biospecimens based on various considerations including relative cellular homogeneity (to minimize intratumoral heterogeneity27), tumor size, and anticipated availability at participating medical centers (Biospecimen Source Sites, BSSs). To minimize variability associated with different clinical exposures, patients who had received chemotherapy, radiation treatment, or immunotherapy for any previous or current cancer diagnosis were excluded from the study. To further reduce potential variability in molecular results, histopathology eligibility criteria were set to require at least 50% tumor by surface area and less than 20% necrosis. The program's inclusion criteria are summarized in Supplemental Figure 1 (see supplemental digital content containing methods and results with 8 figures and 6 tables at www.archivesofpathology.org in the September 2019 table of contents).
Patient Screening, Consenting, and Enrollment
Contracts were competitively awarded to 4 BSSs that were responsible for (1) recruiting, screening, and consenting donors, (2) the prospective collection and systematic processing of selected tumor biospecimens according to BPV-defined experimental protocols, and (3) data collection. An institutional review board (IRB)–approved protocol was used by each BSS for the collection of human biospecimens for research purposes in accordance with the Helsinki Declaration of 1975, as revised in 1983 (Emory University IRB00045796 [approved March 21, 2013]; University of New Mexico IRB00000591 [approved June 28, 2012]; University of Pittsburgh IRB0106147 [approved May 28, 2014], IRB0411047 [approved July 18, 2014], IRB09502110, IRB0506140 [approved May 28, 2014], and IRB056140 [approved June 19, 2014]; and Boston Medical Center IRB00000376 [approved February 05, 2014]). Biospecimens were only collected from patients who met presurgery inclusion criteria (Supplemental Figure 1) and provided written informed consent. A BPV substudy of the ethical, legal, and social implications of biobanking was also conducted at the 4 BSSs and is reported in a separate publication.28
To accommodate processing according to different protocols, individual tissue blocks were dissected from a single piece of tumor to yield 1 FFPE quality control (QC), 1 snap-frozen, and 4 experimental tumor pieces (Figure 1); together these pieces along with their derivative blocks were referred to as a tissue module. To avoid systematic bias and increase statistical significance of the results, experimental tissue pieces within a module were randomly assigned to an experimental protocol (ie, a time point) by using an experimental key randomly generated for each tissue module and provided to BSSs as a hard copy. Also, analytical researchers were blind to the protocol assigned to each biospecimen.
The time points for studying TIF and DTF were designed to be consistent with current clinical practice in medical institutions and to fit within the practical confines of anatomic pathology and tissue banking laboratories at the BSS. For the TIF experimental module, tumor samples were placed into 50 mL of 10% neutral buffered formalin (NBF, Fisher Scientific, Kalamazoo, Michigan) within 1 hour of collection and fixed at room temperature for 6, 12, 23, or 72 hours per protocol A, B, C, or D, respectively (Table). For the DTF experimental module, tumor biospecimens were placed in a cassette and kept in a humidified chamber (100-mL container with gauze dampened with sterile water) for 1, 2, 3, or 12 hours per protocol E, F, G, or H, respectively, then fixed in 50 mL of 10% NBF for 10 to 12 hours at room temperature (Table). Each tissue module also included pieces from the centermost region of the tumor that served as a FFPE QC and a frozen control (Figure 1). The FFPE QC piece experienced a delay of less than 1 hour before fixation at room temperature in 10% NBF for 23 hours, and was designed to provide an initial indication to program pathologists of whether the module met histologic quality criteria. After formalin fixation, all tissue cassettes were processed to paraffin blocks at BSSs by using a dedicated Leica Peloris II tissue processor (Leica Biosystems, Mt Waverly, Australia) and BPV-specified reagents and settings to reduce unintended preanalytical variability (Supplemental Tables 1 and 2). The frozen control was snap frozen in liquid nitrogen for 2 to 30 minutes then stored in liquid nitrogen vapor and was considered an essential gold standard for each module and molecular analysis.
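The protocol assignments described above can be illustrated with a short sketch. The protocol letters and hours are taken from the text; the piece-naming scheme, the module identifier, and the fixed seed are hypothetical and are not the procedure the CBR used to generate its randomization keys.

```python
import random

# Protocol table transcribed from the Methods text.
TIF_PROTOCOLS = {"A": 6, "B": 12, "C": 23, "D": 72}  # hours in 10% NBF
DTF_PROTOCOLS = {"E": 1, "F": 2, "G": 3, "H": 12}    # hours of delay before fixation


def make_module_key(module_id, protocols, rng):
    """Randomly assign the 4 experimental pieces of one module to the 4 protocols."""
    labels = list(protocols)
    rng.shuffle(labels)
    # "piece1".."piece4" is a hypothetical naming scheme for the experimental pieces.
    return {f"{module_id}-piece{i}": label for i, label in enumerate(labels, start=1)}


rng = random.Random(2010)  # fixed seed only so that the sketch is reproducible
key = make_module_key("TIF-0001", TIF_PROTOCOLS, rng)
for piece, protocol in key.items():
    print(piece, "-> protocol", protocol, f"({TIF_PROTOCOLS[protocol]} h in fixative)")
```

Each protocol appears exactly once per module, so every module contributes one piece to every time point, which is what allows within-patient comparisons across time points.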
A single contiguous piece of tumor tissue was divided into 6 equally sized aliquots, which were required to process 1 complete experimental module. Required minimum tumor sizes were defined in consultation with local BSS pathologists. For kidney tumors, a minimum of 1.0 cm3 of tissue was necessary for successful enrollment. For lung, colon, and ovary cases, each aliquot had a minimum requirement of 0.33 cm × 0.5 cm × 0.5 cm (L × W × D). FFPE QC and frozen pieces were taken from the centermost region. At the time of tissue procurement, the BSS pathologists performed a gross examination of each tumor after resection and excluded those that were not one of the BPV tumor subtypes or did not appear to have enough tumor tissue to meet the BPV eligibility criteria of at least 50% tumor by surface area and less than 20% necrosis.
Blood samples and, when available, “normal” tissue adjacent to the tumor, were also collected. Additional experimental tissue modules and blood specimens were collected for studies of other preanalytical factors, to be described separately.
A robust infrastructure was developed to support the collection and analysis of BPV biospecimens. A contract for a Comprehensive Biospecimen Resource (CBR) was competitively awarded to the Van Andel Research Institute (Grand Rapids, Michigan) to support various biospecimen management functions including biospecimen collection kit construction, shipping, storage and receipt, histology, and analyte isolation. A customized, Web-based informatics platform, the Comprehensive Data Resource (CDR), was developed and used for the program (https://github.com/NCIP/CDR). The CDR was developed by Leidos Biomedical Research, Inc., to manage consent, clinical, QC, demographic, and biospecimen handling data for NCI biospecimen programs including the BPV and the Genotype-Tissue Expression (GTEx) programs.29–32 The CDR enabled secure data entry by designated personnel at each BPV-affiliated center and controlled user role-specific access to data by program management. More than 300 clinical, demographic, and specimen handling data elements were collected for each BPV case. Figure 2 describes the flow of data and biospecimens within the BPV program infrastructure.
Sample collection kits that included prelabeled blood collection tubes, tissue cassettes, and randomization keys were prepared by the CBR and shipped to each BSS. Customized forms were designed within the CDR to facilitate the execution of the experimental tissue protocols. Real-time data entry through bar code scanning of tissue cassettes was available to BSSs as well as paper data entry as needed. Once the biospecimens were procured and processed at the BSSs, they were shipped to the CBR where they were inventoried, stored, and sectioned. Frozen control tumor samples were stored in vapor-phase liquid nitrogen at the BSSs and subsequently shipped in a liquid nitrogen dry shipper to the CBR for long-term storage in liquid nitrogen vapor. FFPE tissue samples were shipped to the CBR in a styrofoam box with a cold pack and stored at ambient temperatures in a humidity-controlled room upon receipt. The CBR produced hematoxylin-eosin (H&E)–stained tissue sections from each QC and experimental FFPE tissue block. The H&E sections were digitally scanned and images were reviewed by pathologists at a centralized Pathology Resource Center (PRC) established for the program in order to verify biospecimen eligibility with respect to tumor subtype and to assess biospecimen composition or tumor content according to study criteria (Supplemental Figure 1). Specialized electronic forms were developed in the CDR to facilitate PRC reporting and other data entry functions.
The logistical operations and processes existing at each BSS were markedly different, but the program aimed to harmonize these practices across the 4 sites and control preanalytical variation as much as possible without compromising patient care. The BPV program used more than 90 operating documents including an overarching study protocol, all developed from the best practices at each BSS, expert opinion, and relevant published data. These documents, including standard operating procedures (SOPs), have been released to the public concurrent with publication of this article. The program's quality management (QM) team worked closely with each BSS and the CBR to assure that the SOPs were adopted at each site and that each site had implemented a total quality management plan.33 Ongoing training on program SOPs was conducted with staff at each BSS to ensure consistency of practice between sites. Deviations from standard procedures were reported to and evaluated by the QM team. Documentation in the form of raw data, worksheets, electronically captured data, and/or metrics compilations was reviewed by QM for changes/deviations. QM collaborated with BSS management to identify and implement corrective and/or preventive action in order to reduce the frequency of unintended changes/deviations. The QM team conducted facility audits that consisted of systematic and independent examination of the facilities, personnel qualifications/training, processes, and practices related to the BPV program. Comprehensive audits of resultant datasets reported by study sites were also conducted to confirm the accuracy, completeness, and validity of data resulting from assays/test methods.
DNA and RNA Isolation and Quality Assessment
The CBR performed nucleic acid isolations from FFPE and frozen tissues. RNA from FFPE blocks was extracted by using the QIAsymphony automated robot with the QIAsymphony RNA Kit (RNA FFPE 130 protocol) (Qiagen, Germantown, Maryland), while snap-frozen samples were extracted with the QIAsymphony RNA Kit (RNA CT 400 protocol). DNA from FFPE tissue sections was extracted by using the QIAsymphony automated robot with the QIAsymphony DNA mini Kit (Tissue LC 200 protocol), while snap-frozen samples were extracted with the QIAsymphony DNA mini kit (Tissue HC 200 protocol).
The Thermo Scientific NanoDrop 8000 UV-Vis spectrophotometer (ThermoFisher Scientific, Wilmington, Delaware) was used to determine the concentration, 260/280 ratio (purity), and 260/230 ratio (contamination) of DNA and RNA samples. The LifeTechnologies Qubit 2.0 Fluorometer (Carlsbad, California) was used to quantify DNA and RNA by using the Qubit dsDNA BR Assay Kit and the Qubit RNA BR Assay Kit. The Agilent Bioanalyzer 2100 (Agilent Technologies, Santa Clara, California) was used to determine RIN and DV200 values of RNA samples. RIN is generated by using an entire electrophoretic trace and algorithm, and values range from 1 (lowest quality) to 10 (highest quality). DV200 value represents the percentage of RNA fragments longer than 200 nucleotides. The KAPA Human Genomic DNA Quantification and QC kit (Kapa Biosystems, Wilmington, Massachusetts), hereafter referred to as the DNA KAPA Assay, was used to determine the quality of DNA samples per the manufacturer's instructions. The assay is quantitative polymerase chain reaction (qPCR) based and generates a Q-ratio based on the proportion of medium (129 bp) or long (305 bp) amplicons relative to a shorter amplicon (41 bp) of a highly conserved, single copy gene. A home-brew FFPE RNA QC assay, based on quantitative reverse transcription–polymerase chain reaction (qRT-PCR) and hereafter referred to as the qRT-PCR RNA QC Assay, was used to evaluate RNA quality. The qRT-PCR RNA QC Assay consisted of 3 primer/TaqMan probe combinations designed to yield 2 different length amplicons (∼80 bp and 165 bp) for the stably expressed genes GAPDH (abundantly expressed gene) and PGK1 (less abundantly expressed gene). Primer sequences (obtained from Integrated DNA Technologies, Skokie, Illinois) and additional details of the qRT-PCR RNA QC Assay are described in Supplemental Methods. Briefly, multiplexed 1-step qRT-PCR reactions were prepared such that similar length products for the 2 genes were amplified within the same reaction. 
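As an illustration of the DV200 definition given above (the percentage of RNA fragments longer than 200 nucleotides), the following minimal sketch computes the metric from a simplified trace. It assumes the trace has already been reduced to (fragment size, signal) pairs; the instrument software integrates the full electropherogram, so this is a simplification, and the example values are invented.

```python
def dv200(trace, threshold_nt=200):
    """Percentage of total signal contributed by fragments longer than threshold_nt.

    trace: iterable of (fragment_size_nt, signal) pairs (a simplified,
    hypothetical representation of an electrophoretic trace).
    """
    trace = list(trace)
    total = sum(signal for _, signal in trace)
    if total <= 0:
        raise ValueError("trace has no signal")
    above = sum(signal for size, signal in trace if size > threshold_nt)
    return 100.0 * above / total


# Invented example trace: 40 of 100 signal units lie above 200 nt.
example_trace = [(50, 10.0), (100, 30.0), (150, 20.0), (300, 25.0), (1000, 15.0)]
print(round(dv200(example_trace), 1))
```

Unlike RIN, which depends on the shape of the whole trace, this quantity depends only on how signal is split around a single size cutoff.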
Reactions were run in triplicate for each RNA sample by using 30 ng RNA per reaction, qScript XLT One-Step RT-qPCR ToughMix, and Low Rox (Quanta Biosciences, Beverly, Massachusetts) mastermix on a ViiA7 Real-Time PCR System (ThermoFisher Scientific). Ct (threshold cycle) data for replicates were averaged and quantified for each primer/probe set by using the standard curve generated with Universal Human Reference RNA (Agilent Technologies). Raw Ct values for each triplicate dataset were analyzed by using the Grubbs test to identify outliers, which were subsequently removed from the dataset and considered “undetermined.” Datasets for which 2 or more of the 3 Ct data points were “undetermined” were excluded from analysis. RNA quality was assessed by using the Q-score, a ratio of the mean quantity of a medium (∼165 bp) amplicon in relation to a short (∼80 bp) amplicon. A higher Q-score indicates higher-quality RNA.
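The Ct-to-Q-score workflow described above can be sketched as follows. This is an illustration under stated assumptions, not the laboratory's pipeline: the significance level (0.05, two-sided), the order of operations (outlier removal before averaging), and the standard-curve slope and intercept are all assumed, and the Ct values are invented. For triplicates the Grubbs critical value has a closed form because the underlying t distribution has 1 degree of freedom.

```python
import math
import statistics


def grubbs_critical_n3(alpha=0.05):
    """Two-sided Grubbs critical value for n = 3 (alpha is an assumption)."""
    n = 3
    p = alpha / (2 * n)
    # With n - 2 = 1 degree of freedom the t distribution is Cauchy,
    # so its upper-tail quantile is tan(pi * (0.5 - p)).
    t = math.tan(math.pi * (0.5 - p))
    return (n - 1) / math.sqrt(n) * math.sqrt(t * t / (n - 2 + t * t))


def drop_outlier(cts, alpha=0.05):
    """Return the triplicate with at most one Grubbs outlier removed."""
    mean = statistics.mean(cts)
    sd = statistics.stdev(cts)
    if sd == 0:
        return list(cts)
    suspect = max(cts, key=lambda ct: abs(ct - mean))
    if abs(suspect - mean) / sd > grubbs_critical_n3(alpha):
        kept = list(cts)
        kept.remove(suspect)
        return kept
    return list(cts)


def quantity_from_ct(ct, slope, intercept):
    """Invert a standard curve of the form Ct = slope * log10(quantity) + intercept."""
    return 10 ** ((ct - intercept) / slope)


def q_score(medium_cts, short_cts, slope, intercept):
    """Ratio of medium-amplicon quantity to short-amplicon quantity."""
    medium = quantity_from_ct(statistics.mean(drop_outlier(medium_cts)), slope, intercept)
    short = quantity_from_ct(statistics.mean(drop_outlier(short_cts)), slope, intercept)
    return medium / short


# Invented Ct values and a hypothetical standard curve shared by both amplicons.
example_q = q_score([28.0, 28.1, 27.9], [26.0, 26.1, 25.9], slope=-3.32, intercept=35.0)
print(round(example_q, 2))
```

Because the maximum attainable Grubbs statistic for 3 values is (n − 1)/√n ≈ 1.1547 and the critical value at this level is about 1.1543, a triplicate point is flagged only when the other 2 replicates agree very closely, which is worth bearing in mind when interpreting "undetermined" calls.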
Quality measurements are depicted as scatterplots organized by protocol and tissue type. The mean and the bootstrap 95% confidence interval for the mean are depicted as a solid horizontal line and the box it lies within, respectively. Linear mixed-effects models with random intercepts were used to estimate differences in the various quality measures by protocol, while accounting for within-sample correlations given biospecimens were collected from the same patient. Outcome measures that appeared to be skewed were log-transformed in the modeling. Models were estimated with maximum likelihood, using the lme4 package34 in R.35 Unless otherwise stated, P values are from omnibus tests for differences among protocol conditions. P values of omnibus tests were not adjusted for multiple comparisons; owing to the multiplicity of comparisons, a more stringent threshold of P ≤ .001 was used for declaring statistical significance. Mean differences and P values for pairwise differences were determined for a select subset of comparisons. A 2-sided P value less than .05 was considered significant for pairwise comparisons. P values for tests of global differences are summarized in Supplemental Table 3.
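The bootstrap confidence interval for the mean mentioned above can be illustrated with a minimal percentile-bootstrap sketch. The text does not state which bootstrap variant or how many resamples were used, so the percentile method, the resample count, and the seed are assumptions, and the quality values are invented.

```python
import random
import statistics


def bootstrap_mean_ci(values, n_resamples=2000, level=0.95, seed=0):
    """Percentile-bootstrap confidence interval for the mean of `values`."""
    rng = random.Random(seed)  # seeded only for reproducibility of the sketch
    n = len(values)
    # Resample with replacement, record each resample mean, then sort.
    means = sorted(
        statistics.fmean(rng.choices(values, k=n)) for _ in range(n_resamples)
    )
    lower_idx = int((1 - level) / 2 * n_resamples)
    upper_idx = min(n_resamples - 1, int((1 + level) / 2 * n_resamples))
    return means[lower_idx], means[upper_idx]


# Invented quality-metric values for one protocol and tissue type.
example_values = [62.0, 55.5, 70.1, 48.3, 66.7, 59.2, 73.4, 51.8, 64.0, 57.6]
ci_low, ci_high = bootstrap_mean_ci(example_values)
print(f"mean = {statistics.fmean(example_values):.2f}, 95% CI = ({ci_low:.2f}, {ci_high:.2f})")
```

The interval narrows roughly with the square root of the sample size, which is why groups with few modules show visibly wider boxes than groups with many.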
For clarity within the information management system and labeling scheme for the program, a case was defined as a collection of blood and tissue associated with a single surgical event for a single patient. Matched frozen tissue was considered essential to provide a gold standard for molecular analysis for each module, and matched blood biospecimens were essential for analyses such as array comparative genomic hybridization. The total number of cases screened, consented, and collected is presented in Supplemental Figure 2. Of the more than 4000 cases screened for the BPV study, tissue modules from 203 cases were processed for DTF experiments and 113 for TIF experiments; details are provided in Supplemental Figure 3. The remaining cases were deemed ineligible on the basis of presurgery and postsurgery eligibility criteria or were used to investigate preanalytical variables not addressed in this article (for additional details see supplemental digital content).
Qualification of Tissue Modules for Molecular Analysis
The FFPE QC piece for each tissue module was evaluated to provide an initial indication of whether the module would meet histologic quality criteria for the program. When a single FFPE QC piece was evaluated from each of the 300 cases used in DTF and TIF experiments, 84% met premolecular analysis inclusion criteria that specified a minimum of 50% tumor by surface area and less than 20% necrosis (Supplemental Figure 1).
It was envisioned that the processed BPV tissues would be analyzed by multiple analytical platforms to determine whether effects of DTF or TIF, if any, would be global or specific to particular analytes/platforms. Before commencing molecular analysis, all 4 FFPE experimental tumor blocks from each tissue module were histologically evaluated for cellular composition in an effort to reduce potential variability and minimize the impact of tumor content on molecular analysis results. This extended pathology review revealed that in 59% (176 of 300) of the eligible modules, all 4 experimental tumor blocks met tumor and necrosis requirements (for additional details see supplemental digital content). When QC and experimental tumor blocks were compared with regard to tumor and necrosis requirements, 68% of tissue modules that passed evaluation of the QC block also passed when all 4 experimental blocks were evaluated individually. Conversely, 8% of tissue modules yielded 4 experimental tumor blocks that met tumor and necrosis requirements despite failure of the QC tumor block. Quantifying variability between and within tumor specimens was not possible given the tumor content of each block was binned as a range as opposed to a discrete percentage. However, tumor content distribution among blocks processed under TIF and DTF modules differed among the tissue types examined, as most (73%–84%) kidney blocks had a tumor content of 75% to 100%, while colon and lung blocks displayed a more uniform distribution pattern (Supplemental Figure 4). Data on the tumor content recorded for each block are located in Supplemental Table 5.
Deviations from BPV-supplied SOPs during biospecimen collection and processing were documented and corrective and/or preventive actions were developed and implemented by each BSS. Examples of major and minor SOP deviations can be found in supplemental digital content. While samples from tissue modules affected by minor SOP deviations were included in analysis, those affected by a major deviation were evaluated in terms of their fit-for-purpose for each individual analytical platform and excluded or deprioritized accordingly (for examples see Supplemental Table 4 in supplemental digital content).
Due to insufficient numbers of tissue modules that met all premolecular analysis eligibility criteria (Supplemental Figure 3), lung biospecimens were omitted from DTF and TIF analysis, and colon biospecimens were omitted from TIF analysis. In an effort to achieve a minimum target sample size of 10 modules for each tissue and experiment investigated, several exceptions were made. Given a nearly identical morphology, a single case of fallopian tube carcinoma was included in the TIF ovary sample set (8 cases of ovarian carcinoma) analyzed by RIN and DV200. Nine FFPE experimental blocks (6 colon DTF blocks, 3 ovary TIF blocks) with less than 50% tumor content by surface area and 1 ovary TIF block with greater than 20% necrosis were included in yield and quality metric analyses (Supplemental Table 4). The percentage of experimental blocks with less than 50% tumor content that were included in each dataset was low, reflecting between 0.9% and 2.9% of all experimental blocks analyzed. The exact number of experimental blocks with less than 50% tumor content that was analyzed was unique for each analytical platform and can be found in Supplemental Table 5.
Not all experimental blocks were analyzed by every analytical platform. Individual tissue modules were selected for analysis by specific assays and platforms in order to maximize sample distribution and to ensure adequate sample sizes. Details on tissue type, tumor content, necrosis, eligibility, and assay for each experimental block are located in Supplemental Table 6.
Effects of Delay to Fixation and Time in Fixative on RNA and DNA Quantity
Yields of RNA and DNA from eligible FFPE and frozen tumor biospecimens were calculated from the quantity and the extrapolated tissue volume used for extraction. Comparisons between DNA and RNA yields could not be drawn for frozen and FFPE biospecimens owing to differences in tissue shrinkage introduced during preservation and processing. When FFPE biospecimens were compared, no consistent or significant differences in DNA and RNA yields were observed between different TIF and DTF time points (Supplemental Figures 5 and 6).
Effects of Delay to Fixation and Time in Fixative on RNA Quality
RNA quality was assessed by using several different approaches that included RIN, DV200 value, and a qRT-PCR RNA QC Assay. RINs were significantly and consistently higher for case-matched frozen biospecimens (mean RIN > 7) than for FFPE biospecimens (mean RIN < 3) regardless of tumor type or experimental module (all P values < .001) (Figures 3, A through E). When comparisons were limited to FFPE samples, RIN was not significantly affected by DTF or TIF with the exception of colon blocks processed under DTF protocols (P = .03) (Figure 3, C). Notably, RINs were variable among frozen tumor biospecimens from TIF ovary modules, and DTF and TIF kidney modules.
Mean DV200 values, also generated with an Agilent Bioanalyzer, were comparable for matched frozen and FFPE colon and ovary tumor biospecimens regardless of DTF (Figure 4, A through C). For kidney biospecimens from the DTF module, mean DV200 values differed between frozen and FFPE biospecimens (P = .009), with higher mean DV200 values observed for frozen than for FFPE kidney biospecimens (Figure 4, A). When comparisons were limited to FFPE samples, significant differences in mean DV200 values were observed among kidney tumor biospecimens subjected to a DTF of 12 hours, which were on average 7.7 units higher than the collective mean of the other delays investigated (P = .02). Conversely, mean DV200 values differed between matched frozen and FFPE biospecimens from the TIF module for both kidney (P < .001) and ovary (P = .04) (Figure 4, D and E). When comparisons were limited to FFPE samples, mean DV200 values were significantly lower in kidney tumor samples fixed for 72 hours than the collective mean of 6-, 12-, and 23-hour TIF biospecimens by 35.7 units (P < .001), suggesting that prolonged exposure to formalin (72 hours) significantly affects RNA fragmentation. While the same results were not observed with ovary tumor biospecimens from the TIF experimental module, the sample size of the 2 tumor types differed considerably from one another (kidney, n = 29–31; ovary, n = 9), and correspondingly, so did the bootstrap 95% confidence intervals.
RNA quality was also assessed in the same samples by using a qRT-PCR assay recently developed by the Molecular Characterization Laboratory at the Frederick National Laboratory for Cancer Research. The qRT-PCR RNA QC assay generates a Q-score, which is the ratio of medium (165 bp) to short (80 bp) GAPDH amplicons. Large and significant differences in Q-scores were observed between matched frozen and FFPE samples from both the DTF and TIF experimental modules (P < .001 for all; Figure 5, A through D), consistent with RIN data from these samples. However, in contrast to RIN, Q-scores differed significantly among FFPE DTF time points (P = .03 for colon, P < .001 for kidney and ovary) (Figure 5, A through C). Mean Q-scores were significantly higher for 12-hour DTF tumor biospecimens than the collective mean of 1-, 2-, and 3-hour DTF biospecimens for all tissues investigated (P < .003 for all). Notably, Q-scores from FFPE biospecimens processed under the 12-hour DTF protocol were still markedly lower than those from matched frozen biospecimens (0.71–0.73 for 12-hour DTF FFPE versus 0.96–1.05 for frozen biospecimens). Similar results were also observed for Q-scores generated with PGK1 amplicons (Supplemental Figure 7). Contrary to results obtained with RIN, Q-scores differed significantly among FFPE TIF time points for kidney, as the mean Q-score for biospecimens fixed for 72 hours was significantly lower (by 0.20) than the collective mean of biospecimens fixed for 6, 12, and 23 hours (P < .001), suggesting a detrimental effect of extended fixation on RNA quality similar to that indicated by DV200 results.
Effects of Delay to Fixation and Time in Fixative on DNA Quality
DNA quality was determined by the qPCR-based DNA KAPA assay, which generates a Q-ratio based on the proportion of medium (129 bp) or long (305 bp) amplicons relative to a shorter amplicon (41 bp). While data presented in Figure 6 reflect the ratio of medium/short amplicons (Q129/Q41), comparable results were produced for long/short amplicons (Q305/Q41) (Supplemental Figure 8). Mean Q-ratios differed significantly between matched frozen biospecimens and FFPE biospecimens for both DTF and TIF modules regardless of tissue type (P < .001 for all; Figure 6, A through D), with the exception of ovarian specimens from the TIF module (P = .006; Figure 6, E). Higher mean Q-ratios were observed for frozen than for FFPE biospecimens for all tissue types and modules. When comparisons were limited to FFPE biospecimens, mean Q-ratios differed significantly among biospecimens from the DTF module for all tissues examined (colon, kidney, ovary) (P < .001 for all), with higher mean Q-ratios observed for biospecimens subjected to a DTF of 12 hours than for shorter time points. Mean Q-ratios differed significantly among kidney FFPE biospecimens processed under the TIF experimental module (P < .001), with lower Q-ratios observed for biospecimens fixed for 72 hours than for shorter durations, while no differences were observed among ovary tumor FFPE biospecimens fixed for different durations. The sample sizes of the 2 tumor types differed considerably (kidney, n = 22–24; ovary, n = 3–5), as did their corresponding bootstrap 95% confidence intervals.
The BPV program represents a significant effort to produce a well-controlled study of preanalytical variation and its effects on nucleic acid integrity and subsequent molecular data using human tissue. Findings represent a collection of biospecimens collected and processed at 4 different medical institutions using harmonized practices, and data generated thus far indicate that nucleic acid quality in human tumor biospecimens is adversely affected by extended formalin fixation; effects are tissue- and assay-specific. Additional studies from the program are in progress and will be published as results become available. A BPV companion study of the ethical, legal, and social implications of biobanking has also been completed.28
Consistent data have been obtained that may prove to be important in understanding the molecular effects of different FFPE regimens and evaluating current clinical practices for processing FFPE tissues. A major finding of this study is that prolonged exposure to formalin (72 hours) can significantly decrease RNA quality as measured by 2 RNA fragmentation assays: (1) DV200, which assesses the percentage of RNA fragments greater than 200 nucleotides in an RNA sample, and (2) a qRT-PCR–based system (qRT-PCR RNA QC assay) that assesses quality via the ratio of amplicons of different lengths. Interestingly, RIN values do not reveal these differences in RNA fragmentation, raising the question of the usefulness of RIN as a quality metric for FFPE.36–38 The qPCR-based KAPA assay demonstrated that DNA quality was also significantly reduced after prolonged exposure to formalin. Notably, these significant declines in RNA and DNA quality were observed with kidney tumor biospecimens. While we cannot exclude the possibility that effects on RNA and DNA quality with these platforms are specific to kidney, as ovary biospecimens failed to display significant differences following 72 hours in formalin, the sample size of TIF kidney biospecimens was more than 4 times that of TIF ovary biospecimens. TIF results presented here are in agreement with the existing literature reporting adverse effects on the intensity and size of PCR amplicons39,40 as well as immunohistochemical staining16,41–43 following formalin fixation for 72 hours or longer.
The published literature also reports that formalin fixation for longer than 24 hours was associated with an increased frequency of estrogen receptor- and progesterone receptor-negative results by immunohistochemistry,2 and reports conflict as to whether qRT-PCR is adversely affected by formalin fixation for 24 to 48 hours.44,45 Immunohistochemical analysis of the same BPV biospecimen collection is currently underway for several clinically relevant biomarkers.
The time points chosen for the DTF experimental module (1, 2, 3, and 12 hours) were designed to encompass a typical range of delays to formalin fixation for surgically resected tumors that are routinely sent to the Pathology Grossing Laboratory. Results from the DTF module indicate that a DTF of up to 3 hours does not significantly affect RNA and DNA quality when using the metrics investigated. Others have reported similar findings in FFPE tissue for RNA, as a DTF of 2 hours did not affect bioanalyzer-generated electropherogram patterns46 and a DTF of 12 hours did not affect levels of select transcripts.47–49 Similarly, the quality of DNA isolated from FFPE tissue subjected to a DTF of up to 12 hours did not affect qPCR amplification,49,50 although reductions in FISH and ISH signals have been reported after delays of 2 hours6,50 or more than 3 hours.51 One surprising finding was that FFPE biospecimens subjected to a DTF of 12 hours tended to have significantly higher average Q-ratios and Q-scores for DNA and RNA, respectively, than biospecimens that experienced a 1-, 2-, or 3-hour delay. This finding was unexpected as we had hypothesized that the longest DTF time point would result in more RNA and DNA degradation and lower Q-scores and Q-ratios due to extended exposure to RNases and oxidative factors before fixation. We considered whether the higher Q-ratios and Q-scores associated with 12-hour DTF samples may be the result of degraded RNA, as insufficient template amounts would lead to higher Ct values out of the assay range, and result in imprecise stochastic measurements of the amplicons. However, further evaluation revealed that RNA Ct values for most 12-hour DTF samples were within the standard range of this assay. While the source of this anomaly is presently unclear, analysis of the same biospecimen sample set by more sensitive molecular assays, such as RNAseq and NGS, may prove to be informative. 
Conversely, while RINs of colon biospecimens subjected to a 12-hour DTF were modestly, albeit significantly, lower than those of other DTF time points, DTF-related differences in RINs were not observed in kidney or ovary biospecimens, perhaps reflecting differences in RNase activity between tissue types.
Data from this initial study also clearly demonstrate that formalin fixation and paraffin embedding has a remarkable effect on RNA quality when compared to snap freezing. Indices of RNA quality were consistently and significantly lower for FFPE than matched frozen biospecimens when assessed by RIN and DV200. Q-scores generated with the qRT-PCR RNA QC assay, however, were comparable among FFPE and snap-frozen biospecimens, suggesting that despite widespread changes affecting the electrophoretic trace of a sample, the integrity of individual transcripts remains stable for both abundantly expressed (GAPDH) and less abundantly expressed (PGK; Supplemental Figure 6) genes. Interestingly, DNA quality determined by the qPCR-based KAPA assay was also adversely affected by formalin fixation and paraffin embedding, as lower Q-ratios were consistently observed among FFPE biospecimens relative to snap-frozen controls for all tissue types investigated and for both DTF and TIF modules. Differences in distribution were also observed between quality assays, as samples were quite tightly clustered for FFPE biospecimens for RIN, Q-score, and Q-ratio, while DV200 values exhibited a greater degree of variability between biospecimens. However, a wider distribution of DV200 values is not surprising given the differences in scope between the quality assays investigated. Whereas RIN, Q-score, and Q-ratio are based on quantifiable levels of 2 specific quality markers, DV200 values reflect the percentage of RNA fragments greater than 200 nucleotides in length, which could represent innumerable markers. In fact, the broader scope of the DV200 assay and the wider distribution of values obtained with FFPE biospecimens potentially make it a more meaningful predictor of FFPE sample performance for whole genome and transcriptome analyses such as NGS and RNA sequencing.
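To make the difference in scope concrete, a DV200-style value can be sketched as the share of total signal contributed by fragments longer than 200 nucleotides. The input layout (pairs of fragment size and signal), the use of signal as a proxy for fragment abundance, and the function name are assumptions for illustration; instrument software may compute the metric differently.

```python
def dv200(trace, threshold_nt=200):
    """Percentage of total signal from fragments longer than `threshold_nt`.

    `trace` is an iterable of (fragment_size_nt, signal) pairs, a
    simplified stand-in for an electropherogram.
    """
    trace = list(trace)
    total_signal = sum(signal for _, signal in trace)
    if total_signal <= 0:
        raise ValueError("trace must contain positive total signal")
    signal_above = sum(signal for size, signal in trace if size > threshold_nt)
    return 100.0 * signal_above / total_signal
```

Because every fragment above the threshold contributes to the result, a value of this kind summarizes the whole size distribution rather than 2 specific amplicons or ribosomal peaks.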
Although FFPE biospecimens are commonly used for various DNA and RNA applications due to their widespread availability for research and clinical care, clinically relevant questions remain, such as which of these different quality assays can accurately predict successful performance of FFPE-derived DNA or RNA on particular molecular platforms, and how accurate are the results from the derived data? Future studies under the BPV program, such as results of DNA and RNA sequencing of the same biospecimen collection, which are currently in preparation for publication, may shed light on these questions.
No clear effect of DTF or TIF on DNA or RNA quantity was observed, as yields from DTF and TIF modules were randomly distributed. Differences in processing and tissue shrinkage, and the need to measure the volume of tissue used for each extraction, precluded comparisons between frozen and FFPE biospecimens. However, it is possible that the content of tumor and necrotic tissue within each block, as well as other factors, could potentially impact final nucleic acid quantities.
Given that effects of preanalytical variability can be dependent upon the tumor type, analyte, gene/transcript, and platform used for investigation, it is important that each preanalytical factor of interest be evaluated in different biospecimen types and potential effects on analytes be explored by using different molecular assays to truly assess “fit for purpose” utility and accuracy. Further, alterations in gene expression and transcription at the single cell level would not be detected by using the DNA and RNA quality metrics investigated. Analysis of DTF and TIF tissue modules using molecular and proteomic assays beyond those described here is being conducted to determine if effects are global or specific to particular analytes/platforms. The initial DNA and RNA QC data presented here will be compared with data that are currently being generated with other DNA and RNA platforms such as array comparative genomic hybridization, gene expression profiling, and NGS. As often as possible and practical, the same cases and the same pools of analyte will be used for all DNA and RNA platforms. Therefore, in addition to understanding how preanalytical factors affect biospecimen molecular integrity, these data will also help to determine which QC assays are the best indicators of sample performance on specific analytical platforms.
The BPV study was challenging to execute, requiring standardized procedures across multiple medical institutions and complex experimental protocols. While the rate of patient consent was high (90% of eligible patients consenting to participate), obtaining sufficient numbers of tumor tissue modules for molecular analysis proved challenging, as fewer than 30% of 4000 patients screened for the program met all eligibility requirements (no cancer treatment before surgery, sufficient tumor size, primary tumor, tumor type and subtype). Once a case was deemed eligible and available for collection at a medical institution, a number of other deviations could occur during tissue collection, processing, and storage that resulted in the block or module being excluded from analysis. Due to low numbers of colon and lung tissues, these tissues were prioritized for DTF experiments, a decision that was based on the literature evidence available at the time and a greater perceived clinical need. While the constraints imposed on each institution strengthened the resultant data, a major lesson learned is that even with precise and detailed SOPs and imposed training requirements, SOP deviations should be expected. Having protocols for reporting and documenting deviations and including auditing and QC measures are critical for the success and tracking of biospecimen collections. An additional contributing factor limiting collections was unforeseen intrainstitutional competition for biospecimens by independent tissue collection programs at some BSSs. Although modules were prescreened for histologic quality criteria, using a representative QC block, the requirement that all blocks associated with each tissue module meet the same criteria for tumor content and necrosis further reduced the number of tissue modules qualified for molecular analysis by 40%. 
Such a large reduction in qualified tissue modules suggests that evaluation of a QC block alone is insufficient when verifying tumor and necrosis. While we were unable to address heterogeneity in tumor content within a tumor, different distribution patterns were observed among the tissue types examined, with a disproportionately higher number of kidney blocks with a percentage tumor content of 75% to 100%.
This article represents the first of a series of articles that will highlight and make publicly available the findings from NCI's BPV research program. Initial findings of BPV experiments indicate that overfixation and delay to fixation affect DNA and RNA quality. A comparison of RNA quality metric assays revealed that not all assays were capable of detecting TIF- and DTF-related effects, or differences between snap-frozen and FFPE biospecimens. While studies under the BPV Program were designed to explore practical challenges encountered in clinical settings, recommendations regarding DTF and TIF thresholds as well as the predictive accuracy of the quality assays evaluated require careful consideration of results from molecular assays such as RNA and DNA sequencing, array comparative genomic hybridization, and proteomics such as immunohistochemistry and mass spectrometry. Results of these analyses are currently being prepared for publication. Additional important questions remain, including identifying thresholds of effect and which quality metric assays are the best predictors of biospecimen performance. The results of these studies, like the data reported in this article, will continue to build the body of knowledge on the fitness of FFPE tissues for various molecular platforms and lend support for standardization efforts to reduce FFPE variability and ultimately improve the quality and reproducibility of molecular data. NCI's Biorepositories and Biospecimen Research Branch (BBRB) is preparing an annotated FFPE procedural document, or Biospecimen Evidence-Based Practices,52 that will incorporate the BPV results as well as other data reported in the literature and summarized in the NCI Biospecimen Research Database (https://biospecimens.cancer.gov/brd).
As evidence builds that certain FFPE-processing parameters, such as prolonged formalin fixation of 48 to 72 hours, may compromise molecular analysis, medical institutions may wish to explore ways to improve practices. Such evidence-based practice improvements will be important as the use of FFPE tissues in research continues to expand and molecular analysis becomes more important in routine patient diagnosis.
To provide transparency into the collection procedures for BPV, quality management documents including SOPs are available on BBRB's Web site (https://biospecimens.cancer.gov/programs/bpv/bpv_sops.asp) and in the Biospecimen Research Database (http://biospecimens.cancer.gov/brd). Data generated from the BPV study are available in the database of Genotypes and Phenotypes (dbGaP; https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs001304.v1.p1) under controlled access (study accession phs001304) to enable researchers to further investigate the BPV data. Research tissues, including tissue microarrays, and blood products from the program are available to researchers through an NCI Collaborative Announcement (posted at: https://biospecimens.cancer.gov/programs/bpv/default.asp).
The Biospecimen Preanalytical Variables Program Members are as follows: Biswajit Das, PhD (Leidos Biomedical Research, Inc., Frederick National Laboratory for Cancer Research); Corrine Camalier, MPH (Leidos Biomedical Research, Inc., Frederick National Laboratory for Cancer Research); Robin Burges, MA (Leidos Biomedical Research, Inc., Frederick National Laboratory for Cancer Research; currently with Centrexion Therapeutics Corp); Anna Smith, AAS (Leidos Biomedical Research, Inc., Frederick National Laboratory for Cancer Research); Kimberly M. Elburn, MS (Leidos Biomedical Research, Inc., Frederick National Laboratory for Cancer Research); Tanya Krubit, MS (Leidos Biomedical Research, Inc., Frederick National Laboratory for Cancer Research); Negin Vatanian, MS, MBA (Leidos Biomedical Research, Inc., Frederick National Laboratory for Cancer Research); Takunda Matose, PhD (Leidos Biomedical Research, Inc., Frederick National Laboratory for Cancer Research; currently with the Department of Philosophy at Vanderbilt University); Debra Bradbury, BS (Leidos Biomedical Research, Inc., Frederick National Laboratory for Cancer Research); John Seleski, MA (Leidos Biomedical Research, Inc., Frederick National Laboratory for Cancer Research; currently at iDoxSolutions, Inc); Charles Shive, BS (Leidos Biomedical Research, Inc., Frederick National Laboratory for Cancer Research); Liqun Qi, MS (Leidos Biomedical Research, Inc., Frederick National Laboratory for Cancer Research); Pushpa Hariharan, MS (Leidos Biomedical Research, Inc., Frederick National Laboratory for Cancer Research); Steven Hunter, BS (Leidos Biomedical Research, Inc., Frederick National Laboratory for Cancer Research); Jeff McLean, MS (Leidos Biomedical Research, Inc., Frederick National Laboratory for Cancer Research); Karna Robinson, MPH (Leidos Biomedical Research, Inc., Frederick National Laboratory for Cancer Research); Erin Gover, MS (Leidos Biomedical Research, Inc., Frederick National Laboratory for Cancer Research; currently at The Emmes Corporation); Jasmin Bavarva, PhD (Leidos Biomedical Research, Inc., Frederick National Laboratory for Cancer Research); David Tabor, BA (Leidos Biomedical Research, Inc., Frederick National Laboratory for Cancer Research); Dan Rohrer, MBA (Van Andel Research Institute); Dana Valley, BA (Van Andel Research Institute); Galen Hostetter, MD (Van Andel Research Institute); Eman Dosunmu-Doriney, BS (Emory University); Charles Butler, MBA (Emory University); Brian Burnes, BS (Emory University); Ke Yu, BA (Emory University); Kelly Higgins, PhD (University of New Mexico); Cathleen Martinez, BS, HTL(ASCP), PA (University of New Mexico); Fred Schultz, MA (University of New Mexico); Cheryl Spencer, MS (Boston Medical Center); Molly Lurie-Marino, BS (University of Pittsburgh); Andrea Chavlovich, RN, BS (University of Pittsburgh); and Anthony Green, MT(ASP) (University of Pittsburgh).
The program team would like to thank the research participants whose generous donation of biospecimens made this study possible. We thank the members of the Scientific Steering Committee (SSC): Jennifer Hunt, MD (Massachusetts General Hospital; currently at University of Arkansas for Medical Sciences); Denise Bland-Piontek, CTBS(AATB)HTL(ASCP)QIHC (Massachusetts General Hospital); Kristin Ardlie, PhD (Broad Institute of Harvard and Massachusetts Institute of Technology); Steven Skates, PhD (Massachusetts General Hospital and Harvard Medical School); Peggy Devine, BS, CLS (Cancer Information and Support Network, Inc.); Paul Fearn, PhD, MBA (Fred Hutchinson Cancer Research Center; currently at the National Cancer Institute); Andrea Ferreira-Gonzalez, PhD (Virginia Commonwealth University); Mark Wick, MD (University of Virginia Health System); David Hicks, MD (University of Rochester Medical Center); Dan Liebler, PhD (Vanderbilt University); Elizabeth Mansfield, PhD (US Food and Drug Administration; currently at Grail, Inc.); Terry Speed, PhD (University of California at Berkeley; currently at the Walter and Eliza Hall Institute of Medical Research, Australia); Janet Warrington, PhD (PhyloTech, Inc.; currently at Centrillion Technologies); and Mitch Gail, MD, PhD (National Cancer Institute). We thank the following current and former members of the NCI team for their contributions: Carolyn Compton, MD, PhD, Jim Vaught, PhD, Sherilyn Sawyer, PhD, Nicole Lockhart, PhD, and Lokesh Agarwal, PhD. For their contributions to the BPV Program we also thank extended team members at Leidos Biomedical Research, Inc., Emory University, University of New Mexico, University of Pittsburgh, Boston Medical Center, and the Van Andel Research Institute.
Supplemental digital content is available for this article at www.archivesofpathology.org in the September 2019 table of contents.
Dr Carithers is currently affiliated with the National Institute of Dental and Craniofacial Research, National Institutes of Health. Dr Odeh is currently affiliated with the National Cancer Institute Center to Reduce Cancer Health Disparities. Dr Sachs is currently affiliated with Karolinska Institutet, Stockholm, Sweden. Dr Barcus is currently affiliated with the United States Food and Drug Administration. Mr Fombonne is currently affiliated with Stony Brook University School of Medicine, Stony Brook, New York. Dr Bocklage is currently affiliated with the University of Kentucky College of Medicine.
The authors have no relevant financial interest in the products or companies described in this article.