Genomic medicine requires the identification of biomarkers and therapeutic targets, which in turn, requires high-quality biospecimens. Achieving high-quality biospecimens requires implementing standard operating procedures to control the variations of preanalytic variables in biobanking. Currently, most biobanks do not control the variations of preanalytic variables when collecting, processing, and storing their biospecimens. However, those variations have been shown to affect the quality of biospecimens and gene expression profiling.
To identify evidence-based preanalytic parameters that can be applied and those parameters that need further study.
We searched the Biospecimen Research and PubMed databases using defined key words. We retrieved and reviewed 212 articles obtained through those searches. We included 58 articles (27%) according to our inclusion and exclusion criteria for this review.
Preanalytic variables in biobanking can degrade the quality of biospecimens and alter gene expression profiling. Variables that require further study include the effect of surgical manipulation; the effect of warm ischemia; the allowable duration of delayed specimen processing; the optimal type, duration, and temperature of preservation and fixation; and the optimal storage duration of formalin-fixed, paraffin-embedded specimens in a fit-for-purpose approach.
Genomic medicine treats diseases based on prognostic and predictive biomarkers and therapeutic targets identified through DNA sequence analysis and gene expression profiling of diseased tissues—that is, through biospecimens. To reflect the true genomic changes of disease, gene expression profiling requires high-quality biospecimens, which are those that most closely resemble the tissue before its removal from the human body. To achieve that goal, biobanks need to integrate systems of consenting, annotating, collecting, processing, storing, and distributing biospecimens using unified standard operating procedures (SOPs).
Currently, both within and across institutions, unified SOPs in biobanking are lacking. Because of the lack of unified SOPs, the preanalytic variables in biobanking are not well controlled. However, fluctuations in those variables have been shown to affect the quality of biospecimens and gene expression profiling. Furthermore, the lack of unified SOPs in biobanking has in part led to irreproducible experimental results,1 difficulty in comparing and validating research findings,2 and investigators' concerns about research findings because of the poor quality of biospecimens.3 For example, a survey report by Prinz et al4 showed that almost two-thirds of the published data on therapeutic targets could not be reproduced. Reports by the RAND Corporation5 (Santa Monica, California) indicated that more than 300 million biospecimens were collected and stored in various institutions in the United States in 1999 alone, but the lack of unified SOPs in consenting, annotating, collecting, processing, and storing made it difficult to compare and validate test results using those biospecimens.6 The lack of proper consent and standard annotation of biospecimens has limited the value of that vast resource. However, the government and various organizations both in the United States and abroad have published guidelines and recommendations for biobanking.
The Office for Human Research Protections of the Department of Health & Human Services (Washington, DC) and the National Cancer Institute (Bethesda, Maryland) have issued recommendations on legal and ethical aspects of consenting for biobanking.7,8 The National Cancer Institute, the College of American Pathologists (Northfield, Illinois) Diagnostic Intelligence and Health Information Technology Committee, and the International Society for Biological and Environmental Repositories (ISBER; Vancouver, British Columbia, Canada) have developed guidelines on annotation of biospecimens.9–11 Furthermore, both the National Cancer Institute and ISBER have published guidelines on best practices of biobanking.8,12 However, those guidelines do not provide the specific parameters that are needed to establish SOPs for each variable. Defining specific parameters for each variable would require evidence-based biospecimen science.
Here, we reviewed studies of the effect of preanalytic variables in collecting, processing, and storing biospecimens on biospecimen quality and on gene expression profiling using DNA or RNA as analytes. The variables included warm ischemia, surgical manipulation, cold ischemia/delayed specimen processing, preservation at low temperature, preservative and fixative types, preservation and fixation duration and temperature, freeze-thaw cycles, and storage duration. Our goal in this review is to identify evidence-based parameters on preanalytic variables that can be used now and those that require further study to improve the quality of biospecimens, and thereby, to enhance the accurate identification of biomarkers and therapeutic targets in genomic medicine.
REVIEW OF THE LITERATURE: INCLUSION AND EXCLUSION CRITERIA
We searched the Biospecimen Research Database (http://biospecimens.cancer.gov/brd) and PubMed (http://www.ncbi.nlm.nih.gov/pubmed/) for published literature. The Biospecimen Research Database, a freely accessible database of the National Institutes of Health (Bethesda, Maryland), contains “peer-reviewed literature pertinent to the field of human biospecimen science.” One can search the database using key terms within the categories of analyte, technology platform, type of biospecimen, and normal or cancerous tissue. More than 2000 published articles had been collected in the database as of June 1, 2014. The database is periodically updated, although the frequency of that update is not specified. We searched this database using the terms DNA sequencing, polymerase chain reaction (PCR), real-time quantitative polymerase chain reaction, real-time quantitative reverse transcriptase polymerase chain reaction (qRT-PCR), reverse-transcriptase polymerase chain reaction (RT-PCR), and single nucleotide polymorphism assay. We used the terms biobank, biorepository, and biospecimen for the PubMed (National Center for Biotechnology Information, Bethesda, Maryland) search. We did not include search terms to retrieve studies using immunohistochemistry or in situ hybridization in this review because those topics have been recently reviewed elsewhere.13,14 We retrieved and reviewed 212 articles obtained through these searches. We then excluded articles that met one or more of the following criteria: (1) studies that were published before 1998, (2) studies that did not use tissue from the same specimen for comparison, (3) studies that used assays or reagents that were developed or used only in that laboratory or institution, (4) studies that used nonhuman tissue specimens, (5) studies that did not specify the actual changes, and (6) studies that compared different commercial DNA or RNA extraction kits. We included 58 articles (27%) published between January 1998 and April 2014 in this review.
TISSUE SPECIMENS
Warm Ischemia
Warm ischemia, which occurs when blood vessels to an organ are ligated during surgery, can affect gene expression profiling without affecting RNA quality. One study15 compared the gene expression profiles of specimens collected at the intraoperative exposure of the prostate (in situ) with those of specimens collected immediately after resection (ex vivo). The level of mRNA expression in 8 (EGR1, p21, KRT17, PIM1, S100P, TNFRSF, WFDC2, and TRIM29) of 91 cancer-associated genes (9%) increased at least 2-fold, even though the RNA quality measured by the 28S to 18S ratio was not affected.15 Likewise, using lung cancer specimens collected at the chest opening and immediately after resection, 1% of the genes (eg, TNF, IL6, and FOS) differed by more than 2-fold.16 Therefore, to avoid the effect of warm ischemia on gene expression profiling, collecting biospecimens preoperatively has been suggested as the optimal method.15
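The 2-fold criterion used in these comparisons can be illustrated with a minimal sketch. It assumes paired expression values for the same genes in two specimens and counts a gene as changed when its ratio is at least 2-fold in either direction. The function name, the placeholder gene labels GENE_A and GENE_B, and all expression values are hypothetical and are not data from the cited studies.

```python
import math


def fraction_changed(reference, comparison, fold=2.0):
    """Return the fraction of shared genes whose expression differs by at
    least `fold` in either direction, and the list of those genes.

    Both arguments map gene name -> positive expression value.
    """
    threshold = math.log2(fold)
    shared = [gene for gene in reference if gene in comparison]
    changed = [
        gene for gene in shared
        if abs(math.log2(comparison[gene] / reference[gene])) >= threshold
    ]
    return len(changed) / len(shared), changed


# Hypothetical expression values in arbitrary units.
in_situ = {"EGR1": 10.0, "KRT17": 8.0, "GENE_A": 5.0, "GENE_B": 12.0}
ex_vivo = {"EGR1": 25.0, "KRT17": 9.0, "GENE_A": 2.0, "GENE_B": 12.5}

fraction, genes = fraction_changed(in_situ, ex_vivo)
print(f"{fraction:.0%} of genes changed at least 2-fold: {genes}")
```

Working on the log2 scale treats a 2-fold increase and a 2-fold decrease symmetrically, which is one common convention; the cited studies may have applied their thresholds differently.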
Surgical Manipulation
The extent to which surgical manipulation affects gene expression profiling needs further investigation. A study17 of surgical manipulation compared radical retropubic prostatectomy specimens collected immediately after midline incision (in situ) with those collected immediately after surgical resection (ex vivo). The expression levels of 41 transcripts increased by 2-fold or more; those transcripts included genes for acute-phase response proteins (IER2 and JUNB) and regulators of cell proliferation (p21Cip1 and KLF6). However, the increased gene expression may have been due to surgical manipulation and warm ischemia rather than surgical manipulation per se. Nevertheless, another study found that the greatest change in gene expression was from the time of intraoperative exposure of the prostate to the ligation of the dorsal vein complexes,15 suggesting an effect of surgical manipulation. In contrast, the gene expression profile did not differ in specimens collected using 2 types of prostatectomy procedures (robot-assisted laparoscopic prostatectomy and radical retropubic prostatectomy), although the protein levels differed significantly on tissue microarrays by immunohistochemical analysis.18 Overall, these findings suggest that to study the effect of surgical manipulation, the confounding effect of warm ischemia needs to be controlled.
Cold Ischemia and Delayed Specimen Processing
Cold ischemia occurs when tissues or organs within or removed from the human body are allowed to cool before being preserved. We combined the reviews of cold ischemia and delayed specimen processing because a definitive time point for distinguishing between them is difficult to determine from the literature.
Cold ischemia/delayed specimen processing can affect the quality of nucleic acid and the expression of genes and proteins. The expression levels of 5% of genes in lung cancer altered at least 2-fold after a 30-minute processing delay.16 Similarly, mRNA expression differed more than 2-fold in 2.3% of the genes in colorectal cancer after a 30- to 120-minute processing delay, and the changes started after only a 15-minute processing delay.19 Likewise, the number of altered genes in breast cancer increased with the increasing duration of processing delay, from 0.76% of the genes after a 2-hour delay to 4.1% after a 24-hour delay.20 A biobank study21 compared gene expression profiling in biospecimens before and after the deployment of SOPs, which resulted in more than twice the number of biospecimens cryopreserved within 30 minutes. The study found that the mRNA expression of c-MYC and ER and the estrogen receptor protein level decreased with increasing duration of processing delay,21 demonstrating that unified SOPs in biobanking are needed to ensure meaningful comparison and validation of test results. However, others22,23 have found that the changes in RNA quality and gene expression from delayed specimen processing are insignificant. These findings may be explained by the use of the mean change in gene expression,21 which can mask changes because expression can increase in some genes and decrease in others, or by a small sample size.22 Further study of the effect of processing delay on gene expression profiling is warranted. Nevertheless, these findings suggest that delayed specimen processing can be a confounding factor in the expression of genes and proteins and that standardizing the duration of processing delay could minimize the variations in gene expression profiling.
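The point that a mean can hide changes moving in opposite directions can be shown with a small sketch. It assumes a list of per-gene expression ratios (delayed divided by immediate processing); the function name and the ratios are invented for illustration and do not come from the studies reviewed here.

```python
import math
from statistics import mean


def summarize_log2_changes(ratios, fold=2.0):
    """Return (mean log2 change, fraction of genes changed at least `fold`).

    `ratios` holds positive per-gene expression ratios between two conditions.
    """
    log2_changes = [math.log2(r) for r in ratios]
    threshold = math.log2(fold)
    n_changed = sum(1 for c in log2_changes if abs(c) >= threshold)
    return mean(log2_changes), n_changed / len(ratios)


# Hypothetical ratios: two genes up, two genes down, two unchanged.
ratios = [4.0, 0.25, 2.0, 0.5, 1.0, 1.0]

mean_change, fraction = summarize_log2_changes(ratios)
print(f"mean log2 change: {mean_change:.2f}")
print(f"fraction changed at least 2-fold: {fraction:.2f}")
```

In this constructed example the increases and decreases cancel in the mean even though most genes changed, which is why a per-gene summary can be more informative than an averaged one.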
Preservation at Low Temperature
Various fast-freeze methods have been used to preserve fresh biospecimens, including snap-freezing specimens in liquid nitrogen, embedding specimens in optimal cutting-temperature medium with immersion in −80°C isopentane, and freezing specimens using the carbon dioxide quick-freeze method. All these techniques yielded similar quantities of nucleic acids and proteins and had similar PCR and RT-PCR performance,24 suggesting that results obtained using biospecimens preserved with these freezing methods can be meaningfully compared and validated.
On the other hand, preserving fresh tissue specimens at 4°C overnight yielded nucleic acids and protein of similar quality to that from snap-frozen specimens,25 suggesting that fresh specimens can be kept at 4°C if a short delay in processing is anticipated.
Preservatives, Fixatives, Duration, and Temperature of Preservation and Fixation
Optimal fixation of biospecimens depends on 3 variables at a fixed temperature, namely, tissue thickness, the ratio of tissue to fixative volume, and fixation time.26 Formalin fixation of biospecimens leads to fragmentation of nucleic acids.23,27,28 Hewitt et al26 found that the length of fixation time, when the other variables were controlled, affects the quality of nucleic acids. They have recommended fixation time of 6 to 18 hours for biopsy specimens and 12 to 36 hours for surgical specimens to ensure the quality of nucleic acids. Others have suggested that 8 to 16 hours of formalin fixation at ambient temperature is optimal.23
Specimens fixed using 70% ethanol or alcohol-based, noncross-linking fixatives yielded a higher quality of nucleic acids and better PCR performance than did those fixed with formalin,28,29 indicating that alcohol-based fixatives can be a useful alternative to formalin.
RNALater, the newer tissue preservative, may be a better choice than formalin fixation or even snap freezing for preserving tissue for RNA studies. Tissue specimens collected into RNALater (Ambion, Austin, Texas; Ambion, Foster City, California; Qiagen, Germantown, Maryland; Qiagen, Crawley, West Sussex, United Kingdom) before being snap frozen or stored at 4°C yielded better-quality RNA and gene expression profiling than did matched, non-RNALater, snap-frozen specimens or formalin-fixed, paraffin-embedded (FFPE) specimens.16,19,30–32 However, other researchers33 have not found a difference in the length of the amplicons among specimens that had been preserved in RNALater (R 0901, Sigma Company, St Louis, Missouri), acetone (00341-10-65, Reanal, Budapest, Hungary), or formalin. These findings suggest that the types of preservatives and fixatives as well as the duration of preservation and fixation need to be further studied and standardized to ensure the accuracy of gene expression profiling.
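As an illustration of how the fixation times recommended by Hewitt et al26 might be encoded in a biobank quality check, the following sketch flags specimens fixed outside the recommended window. The dictionary, the function name, and the decision to treat the bounds as inclusive are assumptions made for this example and are not part of the cited recommendation.

```python
# Recommended formalin fixation windows in hours, as quoted in the text.
# Treating both bounds as inclusive is an assumption of this sketch.
FIXATION_WINDOWS_H = {"biopsy": (6, 18), "surgical": (12, 36)}


def fixation_within_window(specimen_type, hours):
    """Return True if the fixation time falls inside the recommended window."""
    low, high = FIXATION_WINDOWS_H[specimen_type]
    return low <= hours <= high


print(fixation_within_window("biopsy", 8))     # inside the 6 to 18 hour window
print(fixation_within_window("surgical", 48))  # longer than 36 hours
```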
Freeze-Thaw Cycles
Freeze-thaw cycles can affect the quality of RNA and alter gene expression profiling, phosphoprotein levels, and enzymatic activity.31,34 These effects depend more on the total thaw time at ambient temperature than on the number of freeze-thaw cycles. A total thaw time of less than 30 minutes at ambient temperature did not affect RNA quality, regardless of the number of freeze-thaw cycles, and any changes in gene expression corresponded to the degradation of RNA.34 In addition, preserving specimens in RNALater (Ambion, Foster City, California) alleviated the effect of thawing on RNA quality.31,34 These findings suggest that degradation of RNA occurs primarily at ambient temperature and that about 30 minutes of cumulative thawing is needed before the degradation becomes significant enough to affect gene expression.
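Because the reported effect tracks cumulative thaw time rather than the number of cycles, a specimen log could record the duration of each thaw and compare the sum with the 30-minute figure. The sketch below assumes such a log exists as a list of minutes per thaw event; the names and example values are hypothetical.

```python
THAW_LIMIT_MIN = 30  # cumulative ambient thaw time reported in the text


def total_thaw_minutes(thaw_events):
    """Sum the minutes spent thawed at ambient temperature across all cycles."""
    return sum(thaw_events)


def rna_quality_at_risk(thaw_events, limit=THAW_LIMIT_MIN):
    """Flag a specimen whose cumulative thaw time reaches the limit,
    regardless of how many freeze-thaw cycles contributed to it."""
    return total_thaw_minutes(thaw_events) >= limit


# Hypothetical logs: four short thaws versus a single long one.
print(rna_quality_at_risk([5, 5, 5, 5]))  # 20 minutes in total
print(rna_quality_at_risk([35]))          # 35 minutes in one cycle
```

The four-cycle specimen stays under the limit while the single-cycle specimen does not, which mirrors the finding that cycle count alone is a poor proxy for RNA degradation.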
Formalin Fixation and Paraffin Embedding
Formalin fixation and paraffin embedding, the most commonly used method of processing biospecimens, involves many variables. Hewitt et al26 provided a good review and recommendations on standardizing the variables. FFPE specimens yield a lower proportion of amplifiable nucleic acids due to fragmentation, and higher false-negative and false-positive rates of mutation detection than are found in matched snap-frozen specimens.35–39
Gallegos et al37 compared the success rate of PCR amplification using genomic DNA extracted from paired FFPE and snap-frozen lung-cancer specimens. They amplified EGFR exons 18 to 21 and KRAS exons 1 and 2 and found that 100% of snap-frozen specimens were amplified, whereas the success rate of amplification in FFPE specimens varied from 19% to 72% (median, 43.5%), with increased success rates in shorter amplicons (success rate increased from 19% to 61% by reducing the amplicon size from 295 base pairs [bp] to 235 bp). Another study40 compared the mutation-detection rate of KRAS exon 2 in paired frozen and FFPE colorectal-cancer specimens. The discordant rate between frozen and FFPE specimens was 9% and 12% using high-resolution melting analysis and direct DNA sequencing, respectively. Likewise, false-positive rates of 10.5% and false-negative rates of 28.9% were found for VHL mutations of clear cell renal cell carcinoma using FFPE specimens.39 In detecting gene rearrangement of T-cell receptor γ, the discordant rate between frozen and FFPE specimens was 32%.35 Furthermore, FFPE specimens for solid-phase, direct DNA sequencing resulted in one false mutation per 500 bases.41 The false mutations caused by FFPE were primarily C>T or G>A transitions.40,41 However, the high false-positive and false-negative mutation rates from FFPE can be overcome using a high depth of coverage with next-generation sequencing technologies.42–44 Overall, a concordance of gene expression profiling between FFPE and snap-frozen specimens can be achieved in amplicons shorter than 200 bases.33,37
In contrast, array-based genotyping platforms produced comparable results for copy number alteration, single nucleotide variation, and loss of heterozygosity between FFPE and snap-frozen specimens.45,46 In addition, FFPE specimens were well correlated (r = 0.80) with snap-frozen specimens in microRNA microarray expression profiling.47 Nevertheless, standardization of the process will reduce the variability of FFPE specimens, making the most-available, feasible, and economically efficient FFPE specimens an invaluable resource.
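The discordant, false-positive, and false-negative rates quoted in this section can be computed from paired calls once a reference is chosen. The sketch below assumes the snap-frozen result is the reference and that each specimen has a simple detected or not-detected call; the cited studies may have defined their rates differently, and the specimen identifiers and calls are invented.

```python
def paired_call_rates(frozen_calls, ffpe_calls):
    """Compare FFPE mutation calls with matched snap-frozen calls.

    Both arguments map specimen ID -> bool (mutation detected).
    The frozen call is treated as the reference (an assumption).
    """
    ids = sorted(set(frozen_calls) & set(ffpe_calls))
    false_pos = sum(1 for i in ids if ffpe_calls[i] and not frozen_calls[i])
    false_neg = sum(1 for i in ids if frozen_calls[i] and not ffpe_calls[i])
    negatives = sum(1 for i in ids if not frozen_calls[i])
    positives = len(ids) - negatives
    return {
        "discordant_rate": (false_pos + false_neg) / len(ids),
        "false_positive_rate": false_pos / negatives if negatives else None,
        "false_negative_rate": false_neg / positives if positives else None,
    }


# Hypothetical paired calls for 10 specimens.
frozen = {f"s{n}": n <= 4 for n in range(1, 11)}        # s1 to s4 mutated
ffpe = {f"s{n}": n in (1, 2, 3, 5) for n in range(1, 11)}

rates = paired_call_rates(frozen, ffpe)
for name, value in rates.items():
    print(f"{name}: {value:.3f}")
```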
Storage Duration and Temperature
The storage duration of FFPE specimens can affect the quality of nucleic acids and gene expression profiling, but the effect is less on microRNA. When 2-year-old FFPE specimens were compared with matched non-FFPE specimens, the gene signals above the backgrounds were reduced 4-fold.16 The FFPE specimens stored for 15 years failed RT-PCR amplification.48 The FFPE specimens stored for 7 years, however, had not significantly altered microRNA expression,49 although a gradual loss of expression was found in those microRNAs that were expressed at low levels and in older (11-year-old) specimens.50 Therefore, aged FFPE specimens that are not fit for RNA or DNA studies may still be fit for microRNA studies. Future studies should investigate the age parameters in this fit-for-purpose approach. Furthermore, reporting the age of FFPE specimens in gene expression profiling may improve the comparison and validation of results.
Whether the temperature and humidity of FFPE storage facilities affect gene expression profiling is unknown. The studies we reviewed did not specify the temperature or humidity of these facilities. The National Cancer Institute's best-practices guidelines recommend that FFPE specimens be stored at a temperature below 80°F (27°C), with humidity and pest control.8
BLOOD SPECIMENS
Table 2 summarizes the characteristics of the studies that used blood specimens.
Storage Duration at Ambient Temperature
Blood specimens are routinely collected in ethylenediaminetetraacetic acid (EDTA) or heparinized tubes. Prolonged storage of blood specimens in those tubes can affect gene expression profiling in a time-dependent manner. Storage of blood specimens in EDTA tubes (Vacutainer system, Becton, Dickinson, and Company, Heidelberg, Germany) at ambient temperature significantly altered the expression of β-actin, cytokeratin-19, GAPDH, HER2, and EGFR. The time interval required to reach an effect differed for each gene, with a significant decrease in the expression of cytokeratin-19 and HER2 after 4 hours, β-actin after 6 hours, GAPDH after 24 hours, and a significant increase in the expression of EGFR after 24 hours.51 However, another study52 of blood specimens in EDTA tubes (Vacutainer, Becton, Dickinson, and Company, Plymouth, Devon, United Kingdom) at room temperature with different time intervals using qRT-PCR for GAPDH found that the threshold cycle values increased at 24 and 30 hours, but the differences were not statistically significant. Yet, storage of blood specimens for 48 hours or longer at ambient temperature resulted in splicing variants of PTEN,53 and the loss of exon 20 of the ATM gene.54 Moreover, another study found that 7-day storage at ambient temperature elevated the expression of IL-6 and TNFα 20-fold.55 However, detection of the BCR/ABL fusion transcript did not differ in peripheral blood and bone marrow aspirate specimens stored at ambient temperature for up to 96 hours.56 These findings indicate that the effect of storage duration at ambient temperature on gene expression profiling is dependent on specific genes or the mutation type, suggesting that every effort should be made to minimize delay in specimen processing. Unified SOPs to standardize the time interval of blood specimen processing would minimize the variations of test results both within and across institutions.
PAXgene Collection Tubes
Blood specimens collected in PAXgene tubes had higher RNA quality and less variation in gene expression profiling than did those collected in EDTA tubes.57,58 However, long-term storage of blood specimens in PAXgene tubes can degrade RNA quality. Kim et al59 suggested that storage of blood specimens collected in PAXgene tubes (PreAnalytix, Qiagen, Valencia, California) should not exceed 1 day at ambient temperature, 4 days at 4°C, or 3 months at −20°C. Another study60 showed that the storage duration and temperature of blood specimens in PAXgene tubes (PreAnalytix, Qiagen, Valencia, California) contributed to 0.09% of the variation in RNA expression. Nevertheless, the evidence suggests that storage duration and temperature, as well as the type of collection tubes for blood specimens, should be standardized and reported to ensure the accuracy of results.
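The storage limits suggested by Kim et al59 lend themselves to a simple lookup that a laboratory information system could apply at accessioning. The sketch below is illustrative only: the condition labels, the function name, and the approximation of 3 months as 90 days are assumptions introduced here.

```python
# Suggested maximum storage durations for PAXgene tubes, in days.
# "3 months" is approximated as 90 days (an assumption of this sketch).
PAXGENE_MAX_DAYS = {"ambient": 1, "4C": 4, "-20C": 90}


def storage_exceeded(condition, days_stored):
    """Return True if the storage time exceeds the suggested limit."""
    return days_stored > PAXGENE_MAX_DAYS[condition]


print(storage_exceeded("ambient", 2))  # 2 days at ambient temperature
print(storage_exceeded("4C", 3))       # 3 days at 4 degrees C
print(storage_exceeded("-20C", 120))   # about 4 months at -20 degrees C
```

Recording the result alongside the specimen, rather than silently discarding out-of-range tubes, would also support the recommendation that storage duration and temperature be reported.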
CONCLUSIONS AND FUTURE PERSPECTIVES
Preanalytic variables in biobanking affect not only the quality of nucleic acid but also gene expression profiling. To ensure the accuracy of test results and to validate those results, implementing unified SOPs in biobanking to control those variables becomes imperative in the era of genomic medicine.
Further studies are needed to determine the effect of surgical manipulation on gene expression profiling, accounting for the confounding factors of warm ischemia; the allowable duration of delayed specimen processing; the optimal type, duration, and temperature of preservation and fixation; and the optimal storage duration of FFPE specimens in a fit-for-purpose approach.
We thank Stanley R. Hamilton, MD, and David A. Wheeler, PhD, for their critical reading of this manuscript. J.H.Z. was supported by NIH grant T32CA163185.
Author notes
The authors have no relevant financial interest in the products or companies described in this article.