Context.—DNA sequencing is the method of choice for mutation detection in many genes.

Objectives.—To demonstrate the analytical accuracy and reliability of DNA sequencing assays developed in clinical laboratories. Only general guidelines exist for the validation of these tests. We provide examples of assay validation strategies for DNA sequencing tests.

Design.—We discuss important design and validation considerations.

Results.—The validation examples include an accuracy study to evaluate concordance between results obtained by the newly designed assay and analyzed by another method or laboratory. Precision (reproducibility) studies are performed to determine the robustness of the assay. To assess the quality of sequencing assays, several sequence quality measures are available. In addition, assessing the ability of primers to specifically and robustly amplify target regions before sequencing is important.

Conclusion.—Protocols for validation of laboratory-developed sequencing assays may vary between laboratories. An example summary of a validation is provided.

As more genes are implicated in disease, the desire and need to analyze genetic information from patient samples has increased dramatically in recent years. Some genes, especially for common diseases, have been extensively studied, while others are relatively newly discovered and need further study. Sanger dideoxy terminator DNA sequencing is a widely used technique to interrogate genes for small mutations and is considered a gold standard for detecting these sequence changes. Other technologies are required for detection of large rearrangements or copy number variations such as large deletions or duplications. Sequencing is especially useful when mutations are scattered across the entire gene or when genes have not been sufficiently studied to determine mutational hot spots. With the increased development of clinical sequencing assays the question arises how to validate such assays. There are general guidelines published by the American College of Medical Genetics (ACMG), the College of American Pathologists (CAP), the National Committee for Clinical Laboratory Standards (NCCLS) (now Clinical and Laboratory Standards Institute, CLSI),1 the Association for Molecular Pathology,2 and others regarding what is required for analytical test validations. With those established general guidelines in mind, we describe an approach for analytical validation of clinical sequencing assays. Sequencing assay design, validation criteria, and quality measures will be addressed. We use examples of assays that were developed for the MECP2 and the SMAD4 genes to demonstrate specific challenges associated with the validation of DNA sequencing assays. In addition, we include a summary for the complete sequence-based assay validation for PTEN, which is used to aid in diagnosing PTEN hamartoma tumor syndrome.

There are several important considerations regarding the design of a Sanger DNA sequencing assay. The first consideration is to identify and become familiar with the reference sequence that will be used. In general, the reference sequence should be the latest updated sequence and the complementary DNA (cDNA) selected from the most common transcript, typically obtained from the National Center for Biotechnology Information (NCBI; http://www.ncbi.nlm.nih.gov/; accessed December 6, 2010). The possible existence of biologically relevant alternative transcripts and of pseudogenes needs to be considered. Secondly, the regions of the gene to include in the analysis should be determined. Description of the gene regions interrogated (ie, promoter, 5′ untranslated region [5′ UTR], exons, introns, and 3′ UTR) and the mutations tested could define the reference and reportable ranges for a molecular sequencing assay. Reportable range also includes the homozygous or heterozygous status of a detected mutation. The distribution of known mutations can aid in deciding which regions to include. For example, Strom et al³ have established an assay for CFTR that includes 98.7% of the known mutations associated with cystic fibrosis, based on the available locus-specific database. If extensive data regarding the mutation distribution are not available, then usually all coding exons, including 20 to 50 nucleotides of flanking intronic sequence, are covered. Promoter regions and deep intronic mutation sites are covered if mutations in these regions are well characterized. Primers can be designed by using available software programs for primer design, such as Primer3⁴ and NCBI's Probe Database for primer selection. Primer sequences can be checked against single nucleotide polymorphism (SNP) databases, such as NCBI's dbSNP and Ensembl, to avoid polymorphic positions. Homology checks, such as NCBI's BLAST (basic local alignment search tool) analysis, can give an indication of the specificity of the primer and check for homology to other genes or pseudogenes that may interfere or compete with primers. Primers with extensive cross-homology to other genes should be avoided if possible.
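
As an illustration only, the check of primer sequences against known polymorphic positions can be sketched in a few lines of Python. The sketch assumes that primer binding sites and SNP positions have already been mapped to the same reference-sequence coordinates; the function name, primer names, and coordinates are hypothetical and are not taken from any of the assays described here.

```python
def primers_overlapping_snps(primer_sites, snp_positions):
    """Return {primer_name: [positions]} for primers that cover a known SNP.

    primer_sites: dict of primer name -> (start, end), 1-based inclusive,
        in reference-sequence coordinates (hypothetical input format).
    snp_positions: iterable of polymorphic positions in the same coordinates,
        for example exported beforehand from a SNP database.
    """
    snps = sorted(set(snp_positions))
    flagged = {}
    for name, (start, end) in primer_sites.items():
        hits = [pos for pos in snps if start <= pos <= end]
        if hits:
            flagged[name] = hits
    return flagged


# Illustrative, made-up coordinates.
primer_sites = {
    "exon2_F": (1001, 1022),
    "exon2_R": (1380, 1401),
    "exon3_F": (2050, 2070),
}
snp_positions = [998, 1015, 1390, 1395, 3000]

flagged = primers_overlapping_snps(primer_sites, snp_positions)
print(flagged)
```

Primers returned by such a check would be candidates for repositioning before any optimization work begins.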

Regions of interest that contain a repeat motif may produce stutter in the sequence. One strategy to combat this is to design 2 sets of primers, one including and one excluding the repeat, so that the region of interest is covered by at least 2 sequence reactions. For instance, if there is a repeat region immediately adjacent to an exon, we would analyze the forward sequence from the amplicon that excludes the repeat and the reverse reaction containing the repeat to obtain adequate coverage.

Once primers have been designed, optimizing the assay is critical. Optimization refers to the testing of various polymerase chain reaction (PCR) conditions (eg, annealing temperature, Mg2+ concentration) to identify an optimal set of conditions for PCR amplification of the DNA sequencing template. Poor sequencing reactions frequently result from suboptimal PCR sequencing template, and repeating reactions for sequencing assays is costly and time-consuming. A good indicator for success of sequencing reactions is the quality of the PCR template. As part of the validation process, data collected from agarose gels can be used to determine the percentage of time that an amplicon amplifies without problems, as illustrated in Figure 1. Problems can include failure to amplify, very weak amplification, and presence of primer dimers or nonspecific products. Primers should be redesigned for amplicons that do not amplify sufficiently well despite optimization.
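
The percentage of successful amplifications per amplicon can be tallied directly from recorded gel results. The Python sketch below is a minimal example under stated assumptions: the outcome codes mirror the abbreviations in the Figure 1 legend (with "OK" added here as a hypothetical code for clean amplification), the data are invented, and the 80% cutoff is taken from the Figure 1 caption rather than from any formal requirement.

```python
def amplification_success_rate(outcomes):
    """Percentage of tested reactions that amplified cleanly.

    outcomes: list of codes per reaction. Assumed codes: "OK" (clean band),
    "F" (failed), "PD" (primer dimer), "U" (unspecific amplification),
    "N/A" (not tested, excluded from the denominator).
    """
    tested = [code for code in outcomes if code != "N/A"]
    if not tested:
        raise ValueError("no tested reactions for this amplicon")
    return 100.0 * sum(code == "OK" for code in tested) / len(tested)


def amplicons_to_redesign(gel_results, threshold=80.0):
    """Names of amplicons whose success rate is not greater than threshold."""
    return sorted(
        name
        for name, outcomes in gel_results.items()
        if amplification_success_rate(outcomes) <= threshold
    )


# Invented example data, not the values shown in Figure 1.
gel_results = {
    "amplicon 1": ["OK"] * 10,
    "amplicon 2": ["OK", "F", "F", "PD", "OK", "OK", "F", "OK", "N/A", "OK"],
    "amplicon 3": ["OK", "OK", "OK", "PD", "OK", "OK", "OK", "OK", "OK", "OK"],
}

for name, outcomes in gel_results.items():
    print(name, round(amplification_success_rate(outcomes), 1))
print(amplicons_to_redesign(gel_results))
```

Counting primer dimers and unspecific products as failures is a deliberately strict choice; a laboratory might instead track them separately if they do not affect the sequence.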

Figure 1. Example of a spreadsheet tracking the amplification quality of each amplicon for the MECP2 gene throughout the validation. The spreadsheet shows 5 samples used for accuracy studies, 3 samples used for within (intra)– and between (inter)–run precision studies, and several control samples. Most amplicons in this example consistently amplified well (green), some had sporadic primer dimers, and some failed to amplify. For amplicon 2, we would redesign the primers to improve the percentage of good amplification to greater than 80%. Abbreviations: F, failed (orange); N/A, not applicable; PD, primer dimer (yellow); U, unspecific amplification.

Figure 2. Determining the amplicon length for the calculation of the percentage of bases with quality values of 20 (%QV20) and greater. Abbreviations: F, forward; R, reverse; %QV20+, percentage of bases with QV20+.


The use of universal sequence-tagged primers, such as M13-tagged primers, may streamline postamplification procedures by allowing all sequencing reactions to be performed with universal forward and reverse sequence primers. However, in some cases, this may increase primer-dimer formation during PCR reactions. Such primer dimers may prevent adequate specific amplification, such that non–M13-tagged primers may need to be redesigned.

To aid test interpretation, publicly available locus-specific mutation databases can be accessed. These are very helpful in the interpretation of DNA sequence variants found in patients. If there is no publicly available database, laboratories could implement their own database based on a review of the literature. Known or obviously pathogenic mutations (eg, frameshift or nonsense mutations) and known benign polymorphisms should be clearly classified. Mutations identified, but not yet classified as benign or pathogenic, should also be clearly marked. These variants of undetermined significance are a common interpretive conundrum for clinical molecular diagnostic laboratories. It is helpful to include in databases the original description of each mutation/polymorphism as well as references that give evidence of classification, such as demonstrated in a GALT mutation database5 and the Alport database6 (http://www.arup.utah.edu/database; accessed December 6, 2010). The ACMG7 has provided guidelines to variant classification, including an approach to investigate variants of undetermined significance.

Performance Characteristics

Accuracy

The purpose of determining the accuracy of an assay is to provide evidence that the test produces the expected results. The accuracy assesses the concordance between results obtained by the new assay and results obtained by another laboratory or results from another method or by another set of primers. Ideally, well-characterized samples containing known sequence variations are used as positive samples in addition to negative control samples. Samples may be available from other clinical or research laboratories or in the form of cell lines from commercial sources. Other methods that can be used as comparison techniques to validate the accuracy of sequencing results can include allele-specific PCR, cloning and sequencing of multiple clones, sequencing with a second primer set, as well as other molecular assays. In some cases, assumed negative control samples may reveal common or even rare sequence variants. One source of negative control samples is residual material from samples sent for clinical testing of other diseases. When these samples are used for validation, they should be deidentified before this testing is performed. Ideally the accuracy study is performed blinded with the operator not knowing the results of the comparison assay. However, because positive control samples can be of limited quantity, targeting analysis to the exon(s) containing the mutation or the variant may streamline validation.

From the accuracy study data, we can calculate the analytical sensitivity, analytical specificity, and analytical accuracy of the assay. The analytical specificity is calculated by dividing the true negatives by the sum of the true negatives plus false positives [TN/(TN+FP)]. False positives could arise from the presence and the analysis of pseudogenes carrying a variant or multiple copies of a gene, some containing a variant. The sensitivity is calculated by dividing the number of true positives by the sum of the true positive and false negatives [TP/(TP+FN)]. In the context of sequencing, we consider a positive the presence of a sequence variant when compared to the reference sequence, while a negative result refers to the absence of variants when compared to the reference sequence. False negatives may occur owing to variants within primers (either PCR primers or sequencing primers if not M13 tagged), which cause preferential amplification of one allele over the other, or large deletions that are not detected by sequencing assays. Accuracy (concordance) is determined by (TP+TN)/(TP+FP+FN+TN). Analytical sensitivity and specificity of a sequencing assay have values usually greater than 99%. The assay would be redesigned if false positives or negatives were detected.
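
The three formulas above translate directly into code. The following Python sketch is illustrative only; the function name and the example counts are hypothetical, and a "positive" is any sequence variant relative to the reference sequence, as defined in the text.

```python
def analytical_metrics(tp, tn, fp, fn):
    """Compute analytical sensitivity, specificity, and accuracy.

    tp, tn, fp, fn: counts of true positives, true negatives,
    false positives, and false negatives from the accuracy study.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative sample")
    return {
        "sensitivity": tp / (tp + fn),            # TP/(TP+FN)
        "specificity": tn / (tn + fp),            # TN/(TN+FP)
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }


# Hypothetical study: 50 known variants, 1 missed; 150 variant-free amplicons.
metrics = analytical_metrics(tp=49, tn=150, fp=0, fn=1)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In this invented example the single false negative lowers the sensitivity to 98%, which under the criterion stated above would prompt a review of the affected amplicon.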

Precision (Reproducibility)

For qualitative assays such as DNA sequencing, “precision” refers to the reproducibility of the assay, that is, the ability to obtain the same results for a sample when the assay is performed repeatedly. To test the robustness and reliability of the assay, a reproducibility study is performed (within-run, between-run). This study assesses the reproducibility of results, PCR amplification quality, and sequence quality. The reproducibility study includes negative and positive samples of the same sample type used for clinical testing, and these samples are extracted with the same method that the clinical laboratory will be using to perform the assay for patients. Most samples for gene sequencing assays are whole blood, mainly with EDTA or acid-citrate-dextrose anticoagulants. If there are multiple possible sample types, a practical approach is to do a full validation on the main sample type and a shortened validation for additional sample types to ensure similar performance characteristics and rule out interfering substances. For example, if family-specific mutations are to be tested, additional sample types may be cultured amniocytes, saliva, buccal swabs, paraffin-embedded tissue, etc. The shortened validation can consist of the analysis of all amplicons of 1 sample if the amplification and sequence data are of sufficient quality.

For reproducibility, both intrarun and interrun reproducibility should be assessed. For the intrarun study, we typically amplify several samples in at least triplicates on the same thermal cycler and assess amplification quality by visualizing products on agarose gels. For interrun reproducibility, several samples are tested on 3 independent runs (on different days with different reagents) and on different thermal cyclers. To reduce costs, one of these runs may be from the accuracy and one from the intrarun reproducibility, so that only 1 additional run is necessary. For interrun precision, amplification quality and sequencing results and quality are also assessed. Objective measures of quality can be used in addition to correct genotype calls (see next section). Because reproducibility studies require large amounts of samples, especially for larger genes with many amplicons, an artificial patient made from “piecing together” samples from several DNA sources may be considered, since each amplicon is compared to its replicate and not to other amplicons.
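
Because each amplicon is compared only with its own replicates, concordance of the variant calls can be checked amplicon by amplicon. The Python sketch below shows one possible way to do this; the data layout, function name, and variant descriptions are placeholders invented for illustration.

```python
def discordant_amplicons(replicate_calls):
    """Return amplicons whose replicates do not all give the same calls.

    replicate_calls: dict of amplicon name -> list of replicates, where each
        replicate is a collection of variant descriptions
        (an empty collection means no variant was called).
    """
    discordant = []
    for amplicon, replicates in replicate_calls.items():
        # Order of calls within a replicate does not matter, so compare sets.
        distinct = {frozenset(calls) for calls in replicates}
        if len(distinct) > 1:
            discordant.append(amplicon)
    return sorted(discordant)


# Made-up triplicate results for three amplicons.
replicate_calls = {
    "exon1": [[], [], []],
    "exon2": [["c.100A>G"], ["c.100A>G"], ["c.100A>G"]],
    "exon3": [["c.250C>T"], [], ["c.250C>T"]],
}

print(discordant_amplicons(replicate_calls))
```

Any amplicon reported as discordant would then be examined together with its amplification and sequence quality records.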

Reportable Range

This parameter is not strictly applicable to this qualitative assay. However, the reportable range can be interpreted to include the regions of the gene that are to be included in the analysis. It could be defined as a description of the gene regions analyzed and the mutations tested. The mutation distribution in a given gene can be taken into consideration. The reportable range also encompasses whether an identified mutation is homozygous or heterozygous.

Reference Range

This parameter is not strictly applicable to this qualitative assay. However, the reference range can be interpreted to mean the reference sequence used for interpretation of the results.

Limit of Detection

A number of terms are associated with analytical sensitivity, including limit of detection and limit of quantitation. To avoid confusion with “analytical sensitivity” as described above, we have chosen to use the term limit of detection (LOD). Although the LOD is usually not a concern for DNA sequencing assays for inherited disorders (germline mutations), knowing the LOD of the assay is crucial if the DNA sequencing assay is being used to detect the presence of a low-level mutation in a background of normal DNA. Examples of applications needing LOD are somatic mutations in cancer, mitochondrial heteroplasmy, or mosaicism. Generally, Sanger sequencing is used as a qualitative assay and is recognized as not having low LOD. Typically, Sanger sequencing detects 10% to 20% mutated DNA in a background of normal DNA.8 A sample that has less than this amount of mutant DNA would most likely appear as wild type by sequencing. In these cases, mutation levels below the LOD will lead to false-negative results and affect the analytical sensitivity [TP/(TP+FN)] of the test.

The LOD of a sequencing assay can be determined by performing dilution experiments. For example, a sample that harbors a known (heterozygous or homozygous) mutation is diluted into a separate sample that is wild type for the same mutation. Although dilutions may be done with cell lines, based on cell count, most often dilutions are performed by using known quantities of DNA. The NCCLS (now CLSI) gives guidelines for studying mixtures by sequencing.1 
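
When planning such a dilution series, the expected mutant allele fraction of each mixture can be estimated beforehand. The Python sketch below is a simplified illustration that assumes both DNA samples are at the same concentration and carry the same gene copy number, so that mixing by parts equals mixing by allele copies; the function name and the mixing ratios are hypothetical.

```python
def expected_mutant_allele_fraction(sample_fraction, mutant_parts, wildtype_parts):
    """Expected mutant allele fraction after mixing two DNA samples.

    sample_fraction: mutant allele fraction of the undiluted positive sample
        (0.5 for a heterozygous, 1.0 for a homozygous mutation).
    mutant_parts, wildtype_parts: relative DNA amounts of the positive and
        the wild-type sample in the mixture (assumed equal concentration).
    """
    total = mutant_parts + wildtype_parts
    if total <= 0:
        raise ValueError("mixture must contain DNA")
    return sample_fraction * mutant_parts / total


# Hypothetical series: heterozygous sample diluted into wild-type DNA.
for mutant_parts, wildtype_parts in [(1, 0), (1, 1), (1, 4), (1, 9)]:
    fraction = expected_mutant_allele_fraction(0.5, mutant_parts, wildtype_parts)
    print(f"{mutant_parts}:{wildtype_parts} -> {fraction:.1%} mutant allele")
```

Under these assumptions, a heterozygous sample mixed 1:1 with wild-type DNA gives an expected 25% mutant allele, and a 1:4 mixture gives 10%, which brackets the range in which Sanger sequencing is expected to lose the variant.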

Assay Specificity

The term analytical specificity also has dual meanings. As explained above, it describes the false-positive rate of an assay. It also describes how the assay is specific for the analyte and for the sample type and therefore can be referred to as assay specificity. For molecular sequencing assays, one interpretation is the sequence specificity of the primers. BLAST searches and gel checks for nonspecific amplification are methods to establish specificity parameters. Interfering substances are also included in this category. Heparin is known to interfere with PCR amplification. Often a simple dilution (1∶5–1∶20) can overcome heparin's inhibitory effect.

Quality Measures

One of our areas of focus during the interrun study is the quality of the sequence. Several objective measures are available to assess this, among them, signal intensity, signal to noise ratio, trace scores, and the number or percentage of bases in a trace above certain quality scores. The signal intensity can vary greatly and is dependent on the individual instrument and run parameters as well as the sample being analyzed. For an Applied Biosystems 3730 DNA Analyzer (Life Technologies Corporation, Carlsbad, California), signal strengths of about 1000 to 5000 relative fluorescent units (RFU) are ideal. When the signal intensity drops below 200 RFU, there is increased noise in the sequence, which leads to more erroneous base calls. Above about 10 000 RFU, “pull-up peaks” or bleed-through, due to spectral overlap of the dyes, can interfere with base calling. Regarding the signal to noise ratio, the NCCLS states that background peaks should not exceed 5% in a sequence.1 This translates into a signal to noise ratio of 20 and seems a reasonable cutoff point. To quickly assess the sequence quality, a trace score,9,10 which is an average quality score across the trace, as well as the percentage of bases with quality values greater than 20 (%QV20+), is useful. To calculate the percentage of bases with QV of 20+, the number of bases with QV20+ provided by ABI's Sequence Scanner software is divided by the amplicon length minus the length of the sequence primer (Figure 2). This amplicon length, excluding the sequencing primer used, gives the theoretical maximum length of sequence. In practice, amplicons with %QV20+ of about 90% have generally good sequence quality. Percentages below about 80% may be due to bad quality. However, this can also be caused by a repeat region followed by stutter in the sequence. Therefore, if amplicons with low quality scores are seen during validation, the reason should be determined. 
If there is a problem that can be avoided, primers should be redesigned for the specific amplicon. The Table shows an example of a sequence-quality tracking spreadsheet. Data with low quality for an exon with a repeat are given as an example (Table, exon 3). It was divided into a short and a long amplicon. The repeat in this case is close to the beginning of the exon and the forward sequencing reaction comes from the short amplicon not containing the repeat. The reverse reaction comes from the long amplicon to ensure that the splice sites are covered with adequate sequence. Beyond the repeat in the reverse reaction we expect low-quality sequence and therefore, the lower quality scores for the reverse reactions are justified.
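
The quality calculations described above are simple arithmetic and can be sketched as follows. This Python example is an illustration under stated assumptions: the function names and the example numbers are hypothetical, and the error-probability conversion assumes that quality values follow the phred convention, QV = -10 × log10(P), of references 9 and 10.

```python
def percent_qv20_plus(qv20_bases, amplicon_length, sequencing_primer_length):
    """Percentage of the theoretically readable bases with QV of 20 or more.

    The denominator is the amplicon length minus the sequencing primer,
    that is, the theoretical maximum length of sequence.
    """
    readable = amplicon_length - sequencing_primer_length
    if readable <= 0:
        raise ValueError("amplicon must be longer than the sequencing primer")
    return 100.0 * qv20_bases / readable


def phred_error_probability(qv):
    """Base-call error probability for a phred-scaled quality value."""
    return 10 ** (-qv / 10)


def signal_to_noise(background_fraction):
    """Signal to noise ratio when background peaks reach this fraction of signal."""
    return 1.0 / background_fraction


# Made-up trace: 378 bases with QV20+, 440-bp amplicon, 20-nt sequencing primer.
print(percent_qv20_plus(378, 440, 20))   # 378 / 420 readable bases
print(phred_error_probability(20))       # QV20 corresponds to about 1 error in 100
print(signal_to_noise(0.05))             # 5% background corresponds to a ratio of about 20
```

A tracking spreadsheet could apply the first function to every forward and reverse read and highlight values that fall below the approximate 80% mark for follow-up.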

Table. Example of a Spreadsheet Tracking the Sequence Quality Indicators for Amplicons Analyzed in the SMAD4 Gene


Although not currently required by CLIA, clinical sensitivity and specificity for the assay can be incorporated into assay validation. Clinical specificity refers to the number of unaffected individuals who have pathogenic mutations in the gene of interest within regions the assay interrogates. In genetic terms, this is the disease penetrance.

Clinical sensitivity refers to the number of patients with the disease in which mutations can be identified with the assay. If laboratories have access to clinically diagnosed patients with the disease, validations could include a clinical sensitivity study. However, identifying enough patients with disease at one's institution may be unfeasible, especially when validating assays for rare diseases. Comparison of the assay to published clinical studies covering the same or similar regions of the gene can estimate clinical sensitivity.

A summary of our validation studies includes an introduction, reference sequence, primer positions (PCR and sequencing primers), accuracy (including analytical sensitivity and analytical specificity), precision, reportable range, reference range, limit of detection (when applicable), conclusions, and bibliographic references for clinical sensitivity and specificity. This summary is provided to clinical testing personnel and made available for CAP inspectors as part of the laboratory inspection. Analytical performance characteristics that may be important for the clinical information of the assay, collected during validation, such as the analytical sensitivity, specificity, LOD (for disorders that show mosaicism), regions of the gene interrogated, and assay limitations, are provided in the test description (such as in a test catalog) available to clinicians ordering the test. Integration into the clinical workflow includes preparing a standard operating procedure and training technologists who will perform the test to ensure that others can perform the test to the same quality standards as during the validation. A laboratory may have a general sequencing standard operating procedure, but will need test-specific information as well. Training technologists includes not only technical aspects of running the test, but also disease information to aid in the understanding of a result and its interpretation. Report templates for negative, positive, and uncertain (when a variant of unknown significance is found) results are drafted; yet, given the variety of testing indications and combination of variants, we often customize sequencing reports. Individualized reports may request familial samples to better interpret sequencing results, recommend additional testing, or summarize the literature describing the current knowledge of a specific sequence variant.
Components and examples of molecular test reports have been further described by Gulley et al.11 Keeping internal databases of variants detected through clinical testing helps maintain consistency in reporting and in variant classification and reclassification. Proficiency testing should be put in place for the new assay. Proficiency testing poses a challenge for sequencing-based assays, since a test may be offered by only one or just a handful of laboratories for a particular gene. Programs offered through the CAP Biochemical and Molecular Genetics Resource Committee include a technical sequence challenge and an alternative proficiency program that matches laboratories that perform less common tests (Sample Exchange Registry for Alternative Assessment; http://www.cap.org/apps/cap.portal?_nfpb=true&cntvwrPtlt_actionOverride=%2Fportlets%2FcontentViewer%2Fshow&_windowLabel=cntvwrPtlt&cntvwrPtlt%7BactionForm.contentReference%7D=laboratory_resources%2Fexc.html&_state=maximized&_pageLabel=cntvwr; accessed November 9, 2010).

Full-gene sequencing is the method of choice to detect mutations in many genes. Before patient testing can begin, however, a sequencing assay needs careful design and validation to establish and ensure its performance characteristics. During the design phase, primers are carefully selected to be specific to the target region and to include as many mutation sites in the amplicons as possible. The assay validation has several parts. An accuracy study helps to ensure that genotypes of known samples can be detected. A precision study helps to determine the robustness and reproducibility of the assay. Validating the quality of the PCR amplification preceding the sequencing is of paramount importance to obtain good-quality sequence. In the Appendix (see Pont-Kingdon et al supplemental material file at www.archivesofpathology.org in the January 2012 folder), we present the validation summary for PTEN as an example of validation. This summary includes a disease summary, methods and primer sequences, accuracy results, precision (reproducibility), references, examples of results, and an annotated reference sequence.

To assess the quality of the sequence reactions, quality scores obtained from both accuracy and reproducibility studies are helpful quality measures. Finally, the implementation phase needs to be carefully considered and completed before the assay can “go live.”

This work was supported by the ARUP Institute of Clinical and Experimental Pathology.

1. Zoccoli MA, Chan M, Erker JC, Ferreira-Gonzalez A, Lubin IM. Nucleic Acid Sequencing Methods in Diagnostic Laboratory Medicine; Approved Guideline. NCCLS document MM9-A. Wayne, PA: NCCLS; 2004.
2. Association for Molecular Pathology. Association for Molecular Pathology statement: recommendations for in-house development and operation of molecular diagnostic tests. Am J Clin Pathol. 1999;111(4):449–463.
3. Strom CM, Huang D, Chen C, et al. Extensive sequencing of the cystic fibrosis transmembrane regulator gene: assay validation and unexpected benefits of developing a comprehensive test. Genet Med. 2003;5(1):9–14.
4. Rozen S, Skaletsky H. Primer3 on the WWW for general users and for biologist programmers. Methods Mol Biol. 2000;132:365–386.
5. Calderon FR, Phansalkar AR, Crockett DK, Miller M, Mao R. Mutation database for the galactose-1-phosphate uridyltransferase (GALT) gene. Hum Mutat. 2007;28(10):939–943.
6. Crockett DK, Pont-Kingdon G, Gedge F, Sumner K, Seamons R, Lyon E. The Alport syndrome COL4A5 variant database. Hum Mutat. 2010;31(8):E1652–E1657.
7. Richards CS, Bale S, Bellissimo DB, et al. ACMG recommendations for standards for interpretation and reporting of sequence variations: revisions 2007. Genet Med. 2008;10(4):294–300.
8. Tsiatis AC, Norris-Kirby A, Rich RG, et al. Comparison of Sanger sequencing, pyrosequencing, and melting curve analysis for the detection of KRAS mutations: diagnostic and clinical implications. J Mol Diagn. 2010;12(4):425–432.
9. Ewing B, Hillier L, Wendl MC, Green P. Base-calling of automated sequencer traces using phred: I, accuracy assessment. Genome Res. 1998;8(3):175–185.
10. Ewing B, Green P. Base-calling of automated sequencer traces using phred: II, error probabilities. Genome Res. 1998;8(3):186–194.
11. Gulley ML, Braziel RM, Halling KC, et al. Clinical laboratory reports in molecular pathology. Arch Pathol Lab Med. 2007;131(6):852–863.

Author notes

From ARUP Laboratories, Institute of Clinical and Experimental Pathology, Salt Lake City, Utah (Drs Pont-Kingdon and Lyon and Mses Gedge, Wooderchak, and Bayrak-Toydemir); the Department of Pathology, Stanford University School of Medicine, Stanford, California (Dr Schrijver); the Department of Pathology and Laboratory Medicine, University of North Carolina, Chapel Hill (Dr Weck); the Department of Pathology, University of Pittsburgh Medical Center, Pittsburgh, Pennsylvania (Dr Kant); the Department of Pathology, Mayo Clinic, Rochester, Minnesota (Dr Oglesbee); and the Department of Pathology, University of Utah, Salt Lake City (Ms Bayrak-Toydemir and Dr Lyon).

The authors have no relevant financial interest in the products or companies described in this article.
