The number of targeted next-generation sequencing (NGS) panels for genetic diseases offered by clinical laboratories is rapidly increasing. Before an NGS-based test is implemented in a clinical laboratory, appropriate validation studies are needed to determine the performance characteristics of the test.
This article provides examples of assay design and validation of targeted NGS gene panels for the detection of germline variants associated with inherited disorders.
The approaches used by 2 clinical laboratories for the development and validation of targeted NGS gene panels are described. Important design and validation considerations are examined.
Clinical laboratories must validate performance specifications of each test prior to implementation. Test design specifications and validation data are provided, outlining important steps in validation of targeted NGS panels by clinical diagnostic laboratories.
With the advent of massively parallel sequencing methodologies, commonly called next-generation sequencing (NGS), the number of genes implicated in human disease has increased substantially in the last decade. The increase in gene discovery has led to a surge in the number of clinical laboratory tests offered to detect genetic variants associated with inherited disorders. Many disorders, such as sensorineural hearing loss, cardiomyopathy, and RASopathies, are genetically and clinically heterogeneous, with variants in numerous genes resulting in overlapping phenotypes. In contrast to a sequential (gene-by-gene) testing approach, such as Sanger sequencing, a disease-targeted NGS panel focused on the simultaneous analysis of a set of genes associated with a specific clinical indication is often a suitable cost-effective alternative. A laboratory must take gene- and disease-specific parameters into consideration when designing and analytically validating NGS-based gene panels for clinical testing. General guidelines for clinical NGS assays have been published by the American College of Medical Genetics and Genomics, the College of American Pathologists, the National Committee for Clinical Laboratory Standards, and the Association for Molecular Pathology.1–3 The College of American Pathologists Biochemical and Molecular Genetics Committee has previously published examples of assay validation for molecular genetic testing, including a methods-based approach for validation of laboratory-developed testing by Sanger sequencing and verification of a US Food and Drug Administration–approved assay for cystic fibrosis mutation testing.4,5 In this manuscript, we describe examples of the design and validation of NGS targeted panels for inherited disorders. Key considerations for test design, assessment of the validity of gene-disease relationships, validation criteria, and quality measures are addressed.
Specifically, a methods-based validation approach6 that was implemented at the Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, is described. We report an integrated validation strategy using HapMap samples and samples with specific disease variants for a combined targeted NGS panel for 5 diseases, including early infantile epileptic encephalopathy, craniofacial disorders, RASopathy disorders, hearing loss, and hereditary cancer. Additional test design considerations are highlighted using examples from the Laboratory for Molecular Medicine at Partners HealthCare Personalized Medicine, Boston, Massachusetts.
FAMILIARIZATION AND PLANNING
Disorders with significant locus and allelic heterogeneity, that is, those caused by multiple different sequence variants in one of several different genes, are typically prioritized for panel testing. Key considerations including specimen volume, expected turnaround time, and calculations of labor, time, and cost of reagents to perform and analyze the test can inform decision making about clinical test development. As with other molecular genetic tests, NGS germline panels can have several useful applications, such as confirming a clinical or prenatal diagnosis, facilitating presymptomatic surveillance, and developing strategies for management and early intervention. It is important to select genes with sufficient scientific evidence of a causative role in the disease, as variants in genes that are not yet established as disease causing are difficult to interpret and can lead to inconclusive results. Determination of the clinical validity of genes and corresponding variants often relies on the evidence presented in published literature. However, the type and depth of published evidence vary greatly for different genes, and objectively assessing the clinical validity of the disease association of genes can therefore be challenging. Thus, it is important for clinical laboratories to establish an objective method to curate and classify the evidence used to determine the strength of gene-disease associations. At the present time there are no expert or regulatory guidelines as to what level of evidence warrants inclusion of a gene in a test designed for diagnosis of a specific inherited disorder, and as a consequence, test content can vary substantially across testing laboratories. A comprehensive evidence-based framework for evaluating gene-disease association validity has been recently made available by the Clinical Genome Resource or ClinGen (https://www.clinicalgenome.org/knowledge-curation/gene-curation/).
Once the gene content for a targeted NGS panel is determined, the next step is to determine the genomic region of interest. Many genes have multiple, alternatively spliced transcripts whose spatiotemporal expression can vary. Currently, there is no consensus among laboratories, and no specification by regulators, on which transcript should be used for sequencing analysis and annotation. Existing approaches range from using a single transcript (eg, the one with the most exons or the one that is predominantly expressed in the tissue of interest) to a more inclusive “all-exon” approach. The limitation of the latter approach is that the relative importance of individual transcripts is often not well defined. Because multiple transcripts may be defined for a particular gene, it is important to state the transcript used for variant reporting in the laboratory report by referencing the messenger RNA transcript and protein sequence numbers used for complementary DNA and protein nomenclature, respectively. Once the transcripts for each gene have been selected, coding exons with flanking intronic regions are used to define the region of interest. There is currently no consensus on the length of intronic sequence that should be included in analysis, although most laboratories include 10 to 20 bases on the intronic side of each intron-exon boundary in order to detect intronic mutations in the splice donor and acceptor sequences. However, it may be important to include deeper intronic regions, for example if known pathogenic mutations occur within an intron of a specific gene.
A thorough review of the variant spectrum associated with each gene is essential to identify common pathogenic variants or hot spots and pathogenic variants located outside of typically covered exonic regions, such as deep intronic or untranslated regions. This information is essential to determine the targeted genomic region of interest and can be valuable when selecting validation specimens. It is also important for determining the clinical sensitivity of testing, that is, what percentage of patients with disease will have a mutation that is detectable by the targeted region. For example, CFTR-related disorders are caused by single-nucleotide variants (SNVs) in exons, 5′ and 3′ untranslated regions, deletions, insertions/duplications, complex rearrangements, and intronic repeat variations. Another example is a common pathogenic variant in the Fabry disease gene (GLA) that would be missed if a standard exon-targeted design is applied. The deep intronic c.640-801G>A variant in GLA is a frequent cause of the X-linked cardiac type of Fabry in the Taiwan Chinese population.7
Information about the disease, including key clinical indicators, disease mechanism, prevalence, mode of inheritance, penetrance, and expressivity, should be investigated at the test design stage. All of these factors play a critical role when interpreting results and writing a clear, concise report.
For the methods-based validation approach described below (see Analytical Sensitivity, Specificity, and Precision section), 151 genes associated with 5 diseases were combined in one panel (12 genes for RASopathy, 97 for hearing loss, 17 for craniofacial disorders, 11 for hereditary cancer, and 15 for early infantile epileptic encephalopathy; see Supplemental Table 1 in the supplemental digital content [containing 5 tables and 1 figure], available at www.archivesofpathology.org in the June 2017 table of contents). One gene, MSX2, was present in both the craniofacial and hearing loss gene lists, which is why the per-disease counts sum to 152 rather than 151. Genes and diseases were reviewed and scored for clinical validity. Information about the disease mechanism, mode of inheritance, and transcripts was noted in a central database (data not shown here). All coding exons (±10 base pairs [bp] into the intron) were targeted for each gene. In total, a ∼0.5-Mb region was targeted for panel design and development.
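The definition of the region of interest described above (coding exons plus 10 bp of flanking intron) can be sketched in a few lines of Python. This is a minimal illustration, not the laboratory's actual design tooling: the function names `build_roi` and `roi_size`, the 1-based inclusive coordinate convention, and the example exon coordinates are all assumptions made for the example.

```python
from typing import List, Tuple

# (start, end) in 1-based inclusive coordinates on a single chromosome (assumed convention)
Interval = Tuple[int, int]


def build_roi(exons: List[Interval], flank: int = 10) -> List[Interval]:
    """Pad each coding exon by `flank` bases on both sides and merge overlaps."""
    padded = sorted((max(1, start - flank), end + flank) for start, end in exons)
    merged: List[Interval] = []
    for start, end in padded:
        if merged and start <= merged[-1][1] + 1:
            # Overlapping or directly adjacent to the previous interval: extend it
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def roi_size(roi: List[Interval]) -> int:
    """Total number of targeted bases."""
    return sum(end - start + 1 for start, end in roi)


# Hypothetical exon coordinates, for illustration only
exons = [(100, 200), (215, 300), (1000, 1100)]
roi = build_roi(exons)
print(roi, roi_size(roi))
```

Merging after padding matters when introns are short: two exons separated by fewer than 20 bp collapse into a single target, so the summed size does not double count bases.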
Using the general guidelines listed above, the following is an example of gene curation for FGFR3 in designing a targeted NGS panel for craniosynostosis. Variants in FGFR3 have been observed in 100% of individuals with Crouzon syndrome with acanthosis nigricans and in 100% of individuals diagnosed with Muenke syndrome.8 FGFR3 has 3 transcripts and 18 coding exons. In severe presentations, de novo pathogenic variants in affected individuals are observed. Advanced paternal age has been reported to be associated with de novo pathogenic variants in Muenke syndrome.9 The majority of pathogenic variants in FGFR3 are missense changes that result in an autosomal dominant gain-of-function effect. There is one recurring pathogenic variant, c.749C>G; p.Pro250Arg, that is the single cause of Muenke syndrome.10 Based on this information, the targeted capture panel was designed to include genomic regions that encompass the coding region. Validation experiments were designed to include a positive control to confirm that the c.749C>G; p.Pro250Arg variant could be detected using the laboratory-based approach (Table 1).
During the design of an NGS gene panel, it is important for the laboratory to be aware of the technical limitations of NGS technology. Many of these limitations may be inherent to all technologies, but some are specific to particular enrichment, sequencing, or bioinformatics techniques.
Interference of Homologous Sequences
There are significant challenges in interrogating medically significant genes with high sequence homology. Some genes, or parts of genes, may not be adequately captured or sequenced to allow for confidence in quality of the data. These include genes with complex sequence contexts such as pseudogenes, genetic rearrangements, and a high GC content. Regions of high homology with other genomic regions, such as pseudogenes or gene duplication events, may lead to false-positive and/or false-negative results due to mismapped reads. It is critical that laboratories assess regions of homology to identify genomic regions within the targeted gene panel that may not be uniquely present in the genome. Variant calls in highly homologous regions that cannot be accurately detected by NGS can often be resolved by other methods such as Sanger sequencing if gene-specific primers can be designed. If that is not possible, affected regions may need to be excluded from the panel. If the excluded gene or region is critical to diagnosis of the disease, other methods such as long-range polymerase chain reaction may need to be used. Regions that are difficult to interrogate by NGS, such as those with high or low GC content and homologous regions, are particularly important to assess during the assay validation, especially in a methods-based validation approach. Short read lengths can make sequence assembly and alignment challenging when homology to other loci is present. Target sequences therefore need to be carefully examined to determine if the sequence context is amenable to short-read NGS. If significant challenges are evident, non-NGS assays may need to be added to the test to ensure optimal clinical validity.11
Exon Level Deletions and Duplications
Deletion or duplication at exon level can be detected via NGS using several commercially available bioinformatics tools; however, the analytical sensitivity and specificity of these changes must be determined by the laboratory.12
Clinically significant homopolymer tracts and triplet repeat expansions are usually not able to be detected by standard NGS and are better analyzed using other methods.
Mosaicism and low levels of heteroplasmy for mitochondrial DNA variants may not be detected, depending on the depth of sequence coverage and limit of detection that is validated. Depending on the availability of parental DNA, the chromosomal phase of identified pathogenic variants may not be determined (ie, whether variants are in cis or trans). Rare variants in primer or probe hybridization sites may compromise analytical sensitivity. Because a multigene panel is typically focused on the coding regions of the gene, regulatory region and deep intronic variants may not be identified.
Based on the design and validation data, a laboratory may decide whether to use another method to fill in for genomic regions that cannot be accurately analyzed by NGS or to exclude the region from analysis. Genomic regions that are not covered by testing should be included in the assay description and laboratory report and clinical sensitivity calculations should be adjusted accordingly.
Below we provide an example from the Laboratory for Molecular Medicine during the development of a hearing loss panel, which lists genes with high homology and/or high GC content (see Supplemental Table 2 for complete gene list). Table 2 shows (1) the number of exons that are affected by high homology to other loci (here defined as 90% of all bases with a mappability score11 of <1) and (2) the number of exons with unusually high (>75%) or low (<35%) GC content. The STRC gene illustrates a gene that cannot be analyzed with standard short-read NGS methods (28 of 29 exons have high homology with a pseudogene, with long stretches being 100% identical across exons and introns between the 2 genes13). Because this gene is a key contributor to nonsyndromic hearing loss14 and clinical specificity (the ability to deduce the genetic cause based on the patient's clinical features) is low, it was deemed critical to be included in a diagnostic gene panel for inherited hearing loss, and this can be accomplished by supplementing the NGS assay with a long-range polymerase chain reaction assay that discriminates between the gene and its pseudogene.13 The TMC1 gene is also affected by homology and GC issues; however, in this case homology is restricted to 1 exon and 4 exons have low GC content. Based on whether these exons are critical, the laboratory director may decide to drop them from the test, particularly if unique Sanger sequencing primers cannot be designed for confirmatory purposes. In this case, unique Sanger sequencing primers were available for the TMC1 exon with homology to another genomic region. Much of the OTOGL gene (34 of 58 exons) has low GC content. Although this may result in poor coverage/data quality, it was considered worth generating validation data first to gauge the true extent of this problem. If the number of exons that fail NGS analysis is low, Sanger sequencing may be used to fill in insufficiently covered regions.
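The two exon-level screens used for Table 2 (GC content above 75% or below 35%, and high homology based on mappability) can be expressed as simple checks. The sketch below is illustrative only: the function names are hypothetical, the per-base mappability scores are assumed to have been computed elsewhere, and the homology rule is read here as "at least 90% of bases have a mappability score below 1", which is one interpretation of the definition given above.

```python
from typing import Sequence


def gc_fraction(seq: str) -> float:
    """Fraction of called bases (A, C, G, T) that are G or C."""
    called = [base for base in seq.upper() if base in "ACGT"]
    if not called:
        raise ValueError("sequence contains no called bases")
    return sum(base in "GC" for base in called) / len(called)


def gc_flag(seq: str, low: float = 0.35, high: float = 0.75) -> str:
    """Classify an exon by the GC thresholds quoted in the text."""
    frac = gc_fraction(seq)
    if frac > high:
        return "high_gc"
    if frac < low:
        return "low_gc"
    return "ok"


def high_homology(mappability: Sequence[float], min_fraction: float = 0.90) -> bool:
    """True if at least `min_fraction` of bases have a mappability score < 1.

    Assumption: this is one reading of the definition used for Table 2.
    """
    non_unique = sum(score < 1 for score in mappability)
    return non_unique / len(mappability) >= min_fraction


# Toy sequences and scores, for illustration only
print(gc_flag("GGGCGCGCGG"), gc_flag("ATATATATGC"), gc_flag("ACGTACGT"))
print(high_homology([0.5] * 9 + [1.0]))
```

Running such a screen over every targeted exon before the pilot run gives a list of regions for which companion assays may be worth considering early.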
For genomic regions that are difficult to analyze by NGS, it may be advisable to investigate the feasibility of developing robust Sanger sequencing primers in parallel, for example to explore whether primer design is possible in these regions. The development of companion Sanger sequencing assays can be approached in different ways depending on the laboratory's general operational approach. For small gene panels, it is often possible and efficient to predevelop orthogonal assays for all exons covered by the test. This strategy does not scale with increasing gene content, as many assays will never be needed (either because no variant requiring confirmation is ever detected or because the region performs robustly by NGS and does not need confirmation by Sanger sequencing). It may be more practical to restrict Sanger predevelopment to vulnerable regions that have an increased likelihood to fail. A pilot run will identify problem regions (ie, those that always fail), but the scope of most test development efforts is usually insufficient to allow identifying all genomic regions of reduced robustness. It is for those regions that an upfront in silico ascertainment of the targeted test region is most useful. Figure 1 summarizes key concepts and provides a decision matrix for dealing with genomic regions that are difficult to sequence by NGS technology.
Selection of Target Enrichment and Sequencing Techniques
A critical step in test development is choosing which sequencing technology and enrichment techniques to use. Several commercial NGS platforms are available. Each sequencing platform has specific parameters that differ in sequence capacity, sequence read length, sequence run time, and quality and accuracy of the data. Size of the targeted region, type of variation detected, required depth of coverage, projected sample volume, turnaround time requirements, and costs are all considered when choosing a sequencer. For a comprehensive review on NGS technologies, the reader is referred to Mardis15 and Metzker.16
All NGS targeted panels require enrichment of targeted genomic regions prior to sequencing. There are several strategies by which target enrichment can be achieved, including polymerase chain reaction–based capture, molecular inversion probe–based capture, and hybrid capture methods. Each approach varies in sensitivity (percentage of target bases that are represented by one or more sequence reads), specificity (percentage of sequences that map to the intended targets), uniformity (variability in sequence coverage across target regions), reproducibility (correlation of results obtained from replicate experiments), cost, ease of use, and amount of DNA required. Mamanova et al17 provide a comprehensive review of target-enrichment strategies. The data shown here were generated with a targeted hybridization-based approach using SureSelect for target enrichment (Agilent Technologies, Santa Clara, California) and sequenced using the MiSeq system (Illumina, Inc, San Diego, California).
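The enrichment metrics defined above can be computed directly from per-base depths and read counts. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the inputs are assumed to come from a coverage tool run over the target regions, and the coefficient of variation is used as one possible measure of uniformity (the text does not prescribe a specific statistic).

```python
from statistics import mean, pstdev
from typing import Sequence


def enrichment_sensitivity(depths: Sequence[int]) -> float:
    """Fraction of target bases represented by one or more reads."""
    return sum(depth >= 1 for depth in depths) / len(depths)


def enrichment_specificity(on_target_reads: int, total_mapped_reads: int) -> float:
    """Fraction of mapped reads that fall on the intended targets."""
    return on_target_reads / total_mapped_reads


def coverage_cv(depths: Sequence[int]) -> float:
    """Coefficient of variation of per-base depth (lower means more uniform).

    Assumption: CV is chosen here for illustration; other uniformity
    statistics are equally valid.
    """
    return pstdev(depths) / mean(depths)


# Toy per-base depths and read counts, for illustration only
depths = [0, 10, 20, 30]
print(enrichment_sensitivity(depths))
print(enrichment_specificity(800, 1000))
print(round(coverage_cv(depths), 3))
```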
Bait Design Strategy
Vendors of target capture assays typically allow custom design of baits. One key consideration is bait density, as this will impact capture efficiency (especially in difficult regions). To ensure reliable capture, it is advantageous to choose a baiting strategy that covers each base more than once. However, for very large targets this may not be practical for economic reasons. If complete coverage is critical (which is typically the case for diagnostic NGS testing) an iterative design process may be an option, where a less dense bait tiling is tested first and underperforming regions are then optimized.
A second key consideration is the total bait territory. For certain hybridization-based enrichment techniques, off-target capture is expected. For regions that capture well, it may not be necessary to cover the entire targeted region with baits, as captured fragments typically extend beyond both ends of a given bait. However, coverage at the edges of the targeted region will always be significantly lower, and these bases often are insufficiently covered. If complete coverage above the minimal acceptable number of reads is desired, it can therefore be beneficial to extend the baited region beyond the actual region of interest. Figure 2 shows the impact of these baiting strategies on final coverage for a representative exon.
BIOINFORMATICS PIPELINE FOR ALIGNMENT, VARIANT CALLING, ANNOTATION, AND FILTRATION
Next-generation sequencing produces an extensive amount of sequence data that is typically processed and analyzed in 3 major steps. The primary step, executed by onboard instrument software, translates sequencing signals into linear sequence with associated individual nucleotide base quality scores analogous to Phred scores. This information is compiled into a file format termed .fastq, which is the input for the secondary step during which sequence reads are aligned to a reference sequence. Aligned reads are compiled into a file format termed .bam. Key information in the .bam file includes read alignment location relative to the reference, read mapping quality, depth of read coverage per mapped location, and forward and reverse read distribution when bidirectional sequencing has been performed. The .bam files can be viewed in genome browsers that also allow visualization of variants in reads relative to the reference. The tertiary step uses the .bam file as input into software that determines differences between the aligned reads and the reference sequence and compiles those differences into a variant call format (VCF) file.18 The tertiary step also includes annotation of the variants (eg, assignment of c. and p. nomenclature) and association of variants with metadata (eg, variant frequency in populations). Each step is complex, and accomplishing all 3 requires a combination of algorithms and software that may be open source or commercial. The choices of algorithms and software are influenced by sequencing chemistry and instrumentation, the application and types of variants to be detected (eg, SNVs or copy number variants), and the bioinformatics expertise of the laboratory. Critically, it has been shown that different bioinformatics pipelines generate differences in variant outputs and accuracy.19–21 The imperfect and evolving state of NGS bioinformatics poses challenges for clinical laboratories with regard to choice and evaluation of bioinformatics tools.
Further discussion on NGS bioinformatics principles can be found in O'Rawe et al,19 Reumers et al,20 and Ross et al.21
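As a small illustration of the base quality scores produced in the primary step, a Phred score Q corresponds to an error probability of 10^(-Q/10), and .fastq files store each score as a single ASCII character. The sketch below assumes the common Phred+33 encoding; the function names are hypothetical and the quality string is a toy example.

```python
from typing import List


def decode_fastq_quality(quality_line: str, offset: int = 33) -> List[int]:
    """Convert a .fastq quality string to Phred scores (Phred+33 assumed)."""
    return [ord(char) - offset for char in quality_line]


def phred_to_error_probability(q: float) -> float:
    """Probability that a base call with Phred score q is wrong."""
    return 10 ** (-q / 10)


# Toy quality string, for illustration only
scores = decode_fastq_quality("I5!")
print(scores)
print([phred_to_error_probability(q) for q in scores])
```

A score of 20 therefore corresponds to a 1 in 100 chance of an incorrect base call, and a score of 40 to 1 in 10 000.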
Once the bioinformatics pipeline has been optimized, a comprehensive validation is performed using sequence reads generated from samples with known variants covering the spectrum of the diagnostic test. As described above, a sufficient number of samples should be analyzed to assess the pipeline's analytic and diagnostic sensitivity and specificity as well as precision. Once a pipeline has been validated, any changes to the protocol need to be documented and revalidated.
For the validation data shown below (see Analytical Sensitivity, Specificity, and Precision), a methods-based validation was performed that encompassed the bioinformatic elements. Read alignment and variant calling were performed with an in-house bioinformatics pipeline that incorporated NovoAlign (Novocraft, Selangor, Malaysia) for read alignment and Picard (for duplicate removal) and the Genome Analysis Toolkit (Broad Institute, Cambridge, Massachusetts) for downstream processing and variant calling (reference sequence: hg19v37). Variant annotation and initial variant filtration were performed with Bench Lab NGS software (Cartagenia, Cambridge, Massachusetts) for variants with coverage of 5× or more. This filtration restricts the data to variants in the Human Genome Mutation Database and/or rare variants with a coding effect such as nonsynonymous, stop loss, stop gain, start loss, insertions/deletions (indels), frameshifts, and variants within the consensus splice site (6 bases in the intron and 2 bases in the exon). Additional information about the alignment and variant calling pipeline is available in Supplemental Figure 1.
All algorithms, software, customizations, and databases used in the analysis of NGS data were documented and versioned. Quality control parameters were developed and documented. Parameters and thresholds that determine the overall quality of a successful sequencing run were established (see Quality Assurance and Quality Control and Supplemental Table 3).
Analytical Sensitivity, Specificity, and Precision
Once the methodology is established and the protocol is optimized in the laboratory, the entire test should be validated, including all steps in the process (wet bench as well as the bioinformatics analyses) using all sample types that will be accepted for the panel (eg, whole blood; saliva; formalin-fixed, paraffin-embedded tissue; buccal swab; cultured amniocytes and chorionic villi). Regulatory requirements and quality management system standards require that laboratories determine assay performance characteristics including analytical sensitivity and specificity and precision or reproducibility.5,22,23 All 3 of these measures are determined by testing samples that are from individuals with known sequence variants and known negative controls. For validation of an NGS panel, it is not feasible to identify and analyze controls for every possible mutation within the targeted genes; therefore, a methods-based validation approach was taken. The methods-based validation approach incorporates samples with known mutations, particularly targeted to common mutations and specific types of variants or genomic regions that may be more difficult to detect, such as indels, GC-rich regions, and regions of repetitive sequence. Positive control samples that have high-confidence SNV and indel calls by whole-genome sequencing, such as NA12878, are available through the Coriell Institute for Medical Research, Camden, New Jersey, and the National Institute of Standards and Technology, Gaithersburg, Maryland. These data were generated by the Genome in a Bottle Consortium by integrating and arbitrating among 14 data sets.24,25 Positive controls may also be obtained through clinical and/or research laboratories by using previously tested methods such as NGS or Sanger sequencing or SNP arrays.
Analytical sensitivity is the likelihood that the assay will detect a sequence variant when present within the targeted region (1 − false-negative rate). This is determined by dividing the number of known variants (true positives) detected by the NGS targeted panel by the total number of known variants detected by a reference method or data set. It is recommended that recurrent disease-causing variants be included in the analyses because these may be seen frequently in a disease cohort.6
Analytical specificity is the likelihood that the assay will be negative when no variant is present (1 − false-positive rate). This measurement is established by dividing the number of true negatives (known reference alleles) by the sum of true negatives and false positives, typically obtained by comparison with the results obtained by a reference method such as Sanger sequencing (or the National Institute of Standards and Technology's high-confidence sequence generated for NA12878).
Knowing that current sequencing platforms and bioinformatic pipelines exhibit differences in their capacity to detect different classes of genetic variations, it is recommended that analytical sensitivity and specificity be established separately for each type of sequence variation such as SNVs, indels, and copy number variants, if applicable.
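The definitions of analytical sensitivity and specificity given above, computed separately per variant class, can be sketched as follows. The `Counts` container and the numbers in the example are hypothetical placeholders, not data from this validation study.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Counts:
    tp: int  # known variants detected by the panel
    fn: int  # known variants missed by the panel
    tn: int  # reference positions correctly called as reference
    fp: int  # variant calls not supported by the reference method


def analytical_sensitivity(c: Counts) -> float:
    """TP / (TP + FN), ie, 1 - false-negative rate."""
    return c.tp / (c.tp + c.fn)


def analytical_specificity(c: Counts) -> float:
    """TN / (TN + FP), ie, 1 - false-positive rate."""
    return c.tn / (c.tn + c.fp)


# Hypothetical counts per variant class, for illustration only
by_class: Dict[str, Counts] = {
    "SNV": Counts(tp=95, fn=5, tn=990, fp=10),
    "indel": Counts(tp=18, fn=2, tn=495, fp=5),
}
for variant_class, counts in by_class.items():
    print(variant_class,
          analytical_sensitivity(counts),
          analytical_specificity(counts))
```

Keeping the counts separate per class avoids a large number of well-performing SNVs masking weaker indel performance in a pooled figure.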
For the methods-based validation study conducted at the Children's Hospital of Philadelphia, 30 samples were used. Among these, 15 samples were previously characterized to carry pathogenic mutations in various target genes across the 5 disease groups (Table 1). The remaining 15 samples were negative controls, which included 13 DNA specimens that tested negative for mutations in selected genes and 2 HapMap samples (NA12878 and NA19240). Genomic DNA was extracted from blood or other patient tissues following standard DNA extraction protocols in the laboratory. Coding regions with 10-bp flanking intronic sequences of genes of interest were enriched using the SureSelectXT Target Enrichment System (Agilent Technologies) for Illumina Paired-End Sequencing Library. Differentially indexed postcapture libraries were sequenced using the MiSeq 2 × 150-bp V2 Reagent Kit (Illumina).
In order to determine sensitivity and specificity of this assay, additional data sets were obtained for select patient samples that had been analyzed previously by alternative technologies. Single-nucleotide variant array data were obtained for 10 of the samples from either the Children's Hospital of Philadelphia cytogenomics laboratory or public databases (such as the 1000 Genomes and EVS databases). In addition, whole-genome sequencing data for the 2 HapMap samples were obtained from the Broad Institute and Illumina, respectively, and the consolidated SNV data from the Genome in a Bottle Consortium24 were obtained for one of the HapMap samples (NA12878). For 5 patients, variant information was available on the Noonan panel of genes through a previously validated NGS protocol. Detailed information for samples used in this validation study and corresponding reference data sets are listed in Table 1.
Analytical sensitivity and specificity of the assay were calculated by comparing variants identified in this assay with variants identified in the reference data sets. For samples with SNV array results available, every position with an array call was analyzed for concordance with the result obtained through the NGS assay (ie, the MiSeq result). Discordant variants were further resolved using Sanger sequencing analysis. Results of sensitivity and specificity studies are shown in Table 3 and Supplemental Tables 4 and 5.
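The position-by-position concordance analysis described above can be sketched as a comparison of two genotype tables. This is a simplified illustration rather than the pipeline used in the study: genotypes are written VCF-style ("0/0", "0/1", "1/1"), the function names are hypothetical, and a position with no NGS variant call is assumed to be homozygous reference, which only holds where coverage is adequate.

```python
from collections import Counter
from typing import Dict, Tuple

Position = Tuple[str, int]  # (chromosome, 1-based coordinate)
HOM_REF = "0/0"


def classify(reference_gt: str, ngs_gt: str) -> str:
    """Classify one position by comparing the reference and NGS genotypes."""
    reference_is_variant = reference_gt != HOM_REF
    ngs_is_variant = ngs_gt != HOM_REF
    if reference_is_variant and ngs_is_variant:
        # Both call a variant; a zygosity mismatch is set aside for review
        return "TP" if reference_gt == ngs_gt else "genotype_discordant"
    if reference_is_variant:
        return "FN"
    if ngs_is_variant:
        return "FP"
    return "TN"


def concordance(array_calls: Dict[Position, str],
                ngs_calls: Dict[Position, str]) -> Counter:
    """Tally outcomes over every position that has an array call."""
    tally: Counter = Counter()
    for position, reference_gt in array_calls.items():
        # Assumption: no NGS variant call at a covered position means reference
        ngs_gt = ngs_calls.get(position, HOM_REF)
        tally[classify(reference_gt, ngs_gt)] += 1
    return tally


# Toy genotype tables, for illustration only
array_calls = {("chr1", 100): "0/1", ("chr1", 200): "0/0", ("chr1", 300): "1/1",
               ("chr1", 400): "0/0", ("chr1", 500): "0/1"}
ngs_calls = {("chr1", 100): "0/1", ("chr1", 400): "0/1", ("chr1", 500): "1/1"}
print(concordance(array_calls, ngs_calls))
```

Positions tallied as FN, FP, or genotype_discordant are the ones that would be taken forward for resolution by Sanger sequencing.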
Recurring false-positive variants in HRAS and MAP2K2 were identified (Supplemental Table 4). Both of these variants had poor quality scores and would have been flagged by the laboratory for confirmation by Sanger dideoxy sequencing analysis. Repeating the assay on the same specimen with a new enrichment kit eliminated the 2 false-positive variants. It is unclear whether the original enrichment kit or a sample preparation error in the original assay caused the false-positive calls. Based on these results, it is recommended that laboratories leverage validation studies to understand the sources of false negatives and false positives and develop strategies to address them. For example, laboratories may choose to review the quality and alignment of the data using tools such as the Integrative Genomics Viewer.26 It is recommended that laboratories develop quality metrics for acceptability of variant calls and a policy on when to confirm variants by an orthogonal method such as Sanger sequencing. Based on our experience with this validation and other additional data not shown here, we have set the following criteria for confirmation by Sanger sequencing: any variant with (1) a read depth less than 20, (2) a call quality less than 500 (Phred score of confidence P value), (3) a genotype quality less than 99, (4) strand bias, with more than 80% of variant reads aligning to a single strand, or (5) an allele frequency less than 40% for heterozygous variants or less than 95% for homozygous variants, as well as (6) any reportable disease-causing variant (classified as variant of uncertain significance, likely pathogenic, or pathogenic; Table 4). Of note, these parameters are not meant for universal use because they are specific to the sequencing and bioinformatics pipeline being used in a laboratory and are likely to undergo modifications as chemistries and informatics tools are updated.
It is recommended that every laboratory determine these parameters based on their experience with their internal laboratory protocols.
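The confirmation criteria listed above lend themselves to a simple rule check. The sketch below encodes the thresholds quoted in the text; the `VariantCall` fields and the function name are hypothetical, and, as noted above, the thresholds themselves are specific to one laboratory's pipeline and would need to be set locally.

```python
from dataclasses import dataclass
from typing import List

REPORTABLE = {"VUS", "likely pathogenic", "pathogenic"}


@dataclass
class VariantCall:
    depth: int
    call_quality: float
    genotype_quality: int
    fwd_variant_reads: int
    rev_variant_reads: int
    allele_fraction: float   # fraction of reads supporting the variant
    zygosity: str            # "het" or "hom"
    classification: str      # eg, "benign", "VUS", "pathogenic"


def sanger_confirmation_reasons(v: VariantCall) -> List[str]:
    """Return every criterion that triggers Sanger confirmation (empty if none)."""
    reasons = []
    if v.depth < 20:
        reasons.append("read depth < 20")
    if v.call_quality < 500:
        reasons.append("call quality < 500")
    if v.genotype_quality < 99:
        reasons.append("genotype quality < 99")
    variant_reads = v.fwd_variant_reads + v.rev_variant_reads
    if variant_reads > 0:
        top_strand = max(v.fwd_variant_reads, v.rev_variant_reads)
        if top_strand / variant_reads > 0.80:
            reasons.append("strand bias > 80%")
    min_fraction = 0.40 if v.zygosity == "het" else 0.95
    if v.allele_fraction < min_fraction:
        reasons.append("allele fraction below threshold")
    if v.classification in REPORTABLE:
        reasons.append("reportable variant")
    return reasons


# Toy calls, for illustration only
clean = VariantCall(100, 1500.0, 99, 25, 25, 0.50, "het", "benign")
poor = VariantCall(15, 300.0, 60, 9, 1, 0.30, "het", "pathogenic")
print(sanger_confirmation_reasons(clean))
print(sanger_confirmation_reasons(poor))
```

Returning the list of triggered criteria, rather than a single yes or no, keeps a record of why each variant was sent for confirmation.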
Two HapMap samples were also used during the validation study. For NA12878, variants within the targeted regions were compared with a reference variant list. Discrepancies among the reference data sets were resolved by further examining the GATK filter and quality score for the Broad WGS data set, variant context, and the filter information in the GIAB variant list. For this study, true negatives were defined as positions without variants in the comparison reference sample. True positives were defined as positions with heterozygous or homozygous variant calls in the comparison reference sample. Comparison between the reference data set and the NGS panel data showed 1 false-positive SNV call and 1 false-positive indel in the panel data set. The false-positive SNV call showed very strong strand bias in the .bam file and the indel call was identified within a homopolymer region (>20 A), indicating that both calls were unlikely to be true variants. In summary, for NA12878, more than 469 121 positions within the targeted region were correctly called as true negatives and 264 variants were called correctly as true positives. For HapMap sample NA19240, the WGS variant data from Illumina and the 1000 Genomes Omni2.5 array data set were obtained and used as the reference data sets. Comparison between the reference data set and the variant set from the panel indicated that 469 047 positions were correctly called as true negatives and 339 variants were called correctly as true positives.
For samples with only one or a few genes previously tested in this laboratory, variant information was extracted from this NGS assay for genes previously tested and compared with the previous test result (Table 1). All SNVs and small indels (<5 bp) that were sufficiently covered (ie, with >30× minimum per base coverage) were successfully identified. Known pathogenic variants were compared and the results are shown in Table 1. A mutation in the ARX gene was not identified in the positive control because of low coverage (<30×). Exon 2 of the ARX gene is GC rich and is traditionally a region that is difficult to sequence. Greater coverage increases the probability of correctly calling a variant; however, there are platform-specific upper limits to coverage. In targeted regions with low coverage, Sanger sequencing or another method may need to be incorporated in order to maximize sensitivity. In this study, all low coverage exons were sequenced with complementary Sanger sequencing; therefore, this variant was correctly identified.
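Identifying the exons that need complementary Sanger sequencing can be sketched as a minimum per-base depth check. This is an illustrative sketch only: the function name, the exon labels, and the depth values are hypothetical. The 30× cutoff comes from the text, and treating a depth of exactly 30 as sufficient is an assumption, because the text does not specify the boundary case.

```python
from typing import Dict, List

MIN_DEPTH = 30  # per-base coverage cutoff quoted in the text


def exons_needing_sanger(per_base_depth: Dict[str, List[int]],
                         min_depth: int = MIN_DEPTH) -> List[str]:
    """Return exons in which at least one base falls below min_depth.

    An exon with no depth values at all is also returned, because its
    coverage cannot be shown to be adequate.
    """
    return sorted(
        exon for exon, depths in per_base_depth.items()
        if not depths or min(depths) < min_depth
    )


# Hypothetical per-base depths for two targets.
depths = {
    "ARX_exon2": [45, 28, 12, 33],
    "GENE_X_exon1": [120, 98, 75],
}
print(exons_needing_sanger(depths))
```

Using the minimum rather than the mean depth matters here: an exon with high average coverage can still contain a short GC-rich stretch that falls below the cutoff.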
In summary, analytical sensitivity and specificity for this method were more than 99% for SNV detection. For indel variants, detection sensitivity and specificity were more than 99% for small indels (<5 bp) and variants within nonhomopolymer regions (<7 of the same nucleotide in a row).
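As a worked illustration of how these figures are derived, the sketch below computes sensitivity and specificity from true-positive, true-negative, false-positive, and false-negative counts. The NA12878 counts are those quoted in the text (264 true positives, 469 121 true negatives, 2 false positives); the number of false negatives is not stated for that sample, so a value of 0 is assumed here purely for illustration. The function names are hypothetical.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of reference variants that the assay detected."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of reference-negative positions called as negative."""
    return true_neg / (true_neg + false_pos)


# Counts quoted in the text for NA12878.
tp, tn, fp = 264, 469_121, 2
fn = 0  # assumption: false negatives are not reported in the text

print(f"sensitivity = {sensitivity(tp, fn):.4%}")
print(f"specificity = {specificity(tn, fp):.4%}")
```

Because true negatives vastly outnumber variants in a targeted panel, specificity stays very close to 100% even with several false positives, so laboratories may also wish to track the absolute number of false-positive calls.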
Precision refers to the reproducibility or “robustness” of the assay, meaning the ability to obtain the same results from the same sample when the assay is performed repeatedly. Both intrarun and interrun reproducibility should be assessed. To evaluate intrarun precision (repeatability), 3 libraries were prepared from the HapMap DNA sample NA12878 in parallel, each with a unique index. An equimolar amount of each library was pooled and sequenced on the same MiSeq flow cell. To evaluate interrun precision (reproducibility), the HapMap NA12878 DNA was captured and sequenced in another independent run. Variants called for each sample/run were compared among the 3 intrarun library samples (NA12878I, NA12878II, and NA12878III) for assessing intrarun repeatability and between the interrun library sample (NA12878) and each of the 3 intrarun samples for assessing interrun reproducibility. Reproducibility was calculated by dividing the number of discordant calls by the total number of positions in the region of interest; results are shown in Table 5.
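The pairwise comparison described above can be sketched as follows. This is a minimal illustration under stated assumptions: each replicate is represented as a hypothetical mapping from genomic position to genotype, positions absent from a mapping are treated as reference calls, and the function name and example data are not from the study.

```python
from typing import Dict, Tuple

Position = Tuple[str, int]  # (chromosome, 1-based coordinate)


def discordance_rate(calls_a: Dict[Position, str],
                     calls_b: Dict[Position, str],
                     total_positions: int) -> float:
    """Discordant calls divided by all positions in the region of interest.

    A position missing from one replicate is treated as a reference call,
    so it is discordant if the other replicate reports a variant there.
    """
    positions = set(calls_a) | set(calls_b)
    discordant = sum(1 for p in positions if calls_a.get(p) != calls_b.get(p))
    return discordant / total_positions


# Hypothetical genotype calls from two replicate libraries.
rep1 = {("chr1", 100): "0/1", ("chr1", 200): "1/1"}
rep2 = {("chr1", 100): "0/1", ("chr1", 300): "0/1"}

rate = discordance_rate(rep1, rep2, total_positions=1000)
print(f"discordance = {rate:.4f}, concordance = {1 - rate:.4f}")
```

The quotient is a discordance rate; the corresponding concordance is its complement, which is the form in which reproducibility is often reported.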
Reference and Reportable Range
Reference range is defined as the range of test values expected for a designated population of individuals (US CFR 493; February 28, 1992). The range is determined by testing a series of specimens from a given population who are known to be free of the disease of concern.23 In the example, reference range is defined as the normal variation of sequence within the population that the assay is designed to detect. Variation in normal individuals can include single-base changes, insertions, deletions, and copy number variation. Reportable range is defined as the portion of the genome for which sequence information can be reliably derived for a defined test system. There may be areas of the targeted regions that cannot be sequenced reliably and thus would be excluded from the reportable range. For example, the targeted panel assay validated here is designed to detect only germline mutations and is not validated for detection of somatic mutations. Based on the validation results and the technical limitations of NGS, variants in homopolymer regions, indels more than 5 bp in size, genes with high homology to pseudogenes or within repetitive regions, and exon level copy number variation were determined to be beyond the reportable range of this assay and thus were excluded for calculation of sensitivity and specificity. Mutations within the promoter regions, deep intronic regions, or regulatory elements are outside of the targeted regions of this assay and thus would not be detected.
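Two of the exclusions above, indel size and homopolymer context, lend themselves to a simple programmatic check. The sketch below is illustrative only: the function names are hypothetical, the flanking-sequence window is assumed to be supplied by the caller, and the homopolymer cutoff of 7 identical bases follows the definition of nonhomopolymer regions given earlier in the text (<7 of the same nucleotide in a row). Pseudogene homology and copy number exclusions are not modeled.

```python
from itertools import groupby


def longest_homopolymer(seq: str) -> int:
    """Length of the longest run of a single repeated base in seq."""
    return max((len(list(run)) for _, run in groupby(seq.upper())), default=0)


def in_reportable_range(ref: str, alt: str, context: str,
                        max_indel: int = 5, homopolymer_len: int = 7) -> bool:
    """Apply the indel-size and homopolymer exclusions described in the text.

    ref/alt are the reference and alternate alleles; context is the
    flanking reference sequence around the variant (window size is an
    assumption left to the caller).
    """
    indel_size = abs(len(ref) - len(alt))
    if indel_size > max_indel:
        return False
    if longest_homopolymer(context) >= homopolymer_len:
        return False
    return True


print(in_reportable_range("A", "G", "ACGTACGTAC"))
print(in_reportable_range("A", "AT", "CCAAAAAAAAGT"))
```

Checking the whole flanking window is a conservative simplification; a production pipeline would more likely test whether the variant itself lies within or adjacent to the homopolymer run.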
Genomic data have revealed the complexity of the human genome, and the concept of 1 gene–1 disease has changed. This has implications in all areas of medicine and is not limited to rare diseases.27 Variant interpretation is typically performed using data from population frequency databases, segregation analysis, mutation databases, reported studies, and putative impact on protein function.
It is recommended that variants be interpreted using the recently published variant classification guidelines.28,29 Those variants that occur at a high frequency (usually greater than 5%) in a population are often filtered out by bioinformatic analysis.29 However, for many rare disorders or for particular genes a frequency of 5% may be too high; thus, more stringent thresholds may be used for filtering if information on prevalence and penetrance is available. For reporting a variant, it is important to determine if the effect of the variant is consistent with the patient's phenotype and also to examine the segregation of the variant within the proband's family (when family members are available).
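The frequency-based filtering step can be illustrated with a short sketch. The variant records, the `population_af` key, and the function name are hypothetical; the 5% default reflects the commonly used cutoff mentioned above, and the stricter 1% value in the example is an arbitrary illustration of a more stringent threshold, not a recommendation.

```python
from typing import Dict, List, Optional


def filter_common(variants: List[Dict[str, Optional[float]]],
                  max_af: float = 0.05) -> List[Dict[str, Optional[float]]]:
    """Keep variants whose population allele frequency does not exceed max_af.

    Variants with no frequency data (None or missing) are kept, because
    absence from a population database is not evidence that a variant is common.
    """
    kept = []
    for v in variants:
        af = v.get("population_af")
        if af is None or af <= max_af:
            kept.append(v)
    return kept


# Hypothetical variants with illustrative frequencies.
variants = [
    {"id": "var1", "population_af": 0.12},
    {"id": "var2", "population_af": 0.02},
    {"id": "var3", "population_af": None},
]
print([v["id"] for v in filter_common(variants)])
print([v["id"] for v in filter_common(variants, max_af=0.01)])
```

Exposing the threshold as a parameter allows a laboratory to set gene- or disease-specific cutoffs where prevalence and penetrance data support them.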
The American College of Medical Genetics and Genomics first published recommendations for sequence interpretation in 2005, and then again in 2008, with the most recent revision coming in 2015, introducing the 5-term classification system.28–30 These guidelines are specifically directed toward inherited disease testing in clinical laboratories, though they have also been used for somatic variant classification. For population frequencies, data from large-scale sequencing projects, such as the 1000 Genomes Project and projects focused on data aggregation, such as the Exome Aggregation Consortium and Genome Aggregation databases, are now freely available for use in research and diagnostic settings.31,32 The Clinical Genome Resource aims to improve our understanding of genomic variation through data sharing and collaboration, starting with aggregating sequence and structural variants in the National Center for Biotechnology Information's publicly available ClinVar knowledge base.33–35 There are several other useful databases, such as the Leiden Open Variation Database (www.lovd.nl/3.0/home), the Human Gene Mutation Database (www.hgmd.cf.ac.uk/ac/index.php), and disease-specific databases such as the Clinical and Functional Translation of CFTR Mutation Database (cftr2.org) that can be very helpful in obtaining variant information. All publications addressing segregation in families and controls must be carefully reviewed. Functional studies are helpful in determining if the variant impacts normal function or expression. However, these studies may be challenging to interpret because there are no perfect model systems and results may be contradictory among different analyses. The final report should include the variant classification and all the evidence supporting the variant classification, including references and whether the variant(s) detected fully or partially explain the patient's phenotype.29
As the number of sequencing variants grows, additional evidence may warrant variant reanalysis. For example, the access to sequence data on more than 60 000 individuals in the Exome Aggregation Consortium database (exac.broadinstitute.org) has led many variants of uncertain significance to be reclassified as benign. To set appropriate expectations, laboratories may develop policies on the reanalysis of genetic data.
QUALITY ASSURANCE AND QUALITY CONTROL
General quality assurance and quality control recommendations are stated in the Clinical Laboratory Improvement Amendments of 1988, and more specific molecular and sequencing quality assurance and quality control recommendations have been articulated by the Clinical and Laboratory Standards Institute (MM9-A2, MM20), the College of American Pathologists, the American College of Medical Genetics and Genomics, and the Association for Molecular Pathology.3,6,22,36
A quality assurance program for NGS testing will assess preanalytic, analytic, and postanalytic processes used from enrichment and sequence analysis through reporting. The program addresses problems that arise in the course of testing, such as events that can affect the test result or nonconformance with the laboratory's own policies and procedures. Documentation includes both review of the effectiveness of corrective actions taken and the revision of policies and procedures intended to prevent recurrence.
Documentation of all testing processes is a critical part of laboratory quality assurance. All standard operating protocols of DNA/RNA sample preparation, fragmentation, library preparation, bar coding (molecular indexing), sample pooling, and sequence generation are documented so that each step and subsequent manipulations can be traced. Metrics and quality control parameters used to assess run performance are also documented. Commonly used metrics include the fraction of bases meeting specified quality and coverage thresholds and average coverage/base and target region (Table 4; Supplemental Table 3). The laboratory should define and document acceptance and rejection criteria for each test step. It is critical to determine and summarize regions that failed analysis (eg, because of inadequate coverage) if they are not covered by orthogonal technologies such as Sanger sequencing. Assuring sample traceability throughout the whole analysis workflow is critical so that sample swaps can be easily detected.
The routine application of a validated bioinformatics pipeline is accompanied by monitoring of laboratory-determined quality control metrics. Divergence from expected quality metrics during the analysis of clinical samples requires investigation and resolution. These metrics are assessed per run/sample as well as routinely to detect trends. An example would be when the bioinformatics output of NGS data demonstrates an insufficient number of sequence reads passing an expected or required base quality score threshold. Deviations may indicate a technical aberration or process failure occurring during technical wet bench procedures or during a step in the bioinformatics pipeline. It is suggested that the clinical laboratory review a summary of the quality scores, metrics, and total number of reads to determine overall quality of the run before start of alignment given the time and other resources required. Quality control procedures are designed to ensure expected test performance, detect assay failure, and provide confidence that a reliable result is generated.
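A per-run check of this kind can be sketched as a comparison of observed metrics against laboratory-defined minimums. Everything in the example is hypothetical: the metric names and threshold values are placeholders chosen for illustration and are not the criteria used by either laboratory described in this article.

```python
from typing import Dict, List

# Hypothetical minimum acceptable values; each laboratory must set its own.
QC_THRESHOLDS: Dict[str, float] = {
    "pct_bases_q30": 80.0,          # % of bases with quality score >= Q30
    "mean_target_coverage": 100.0,  # mean depth across targeted regions
    "pct_target_at_30x": 95.0,      # % of targeted bases covered >= 30x
}


def failed_metrics(run_metrics: Dict[str, float],
                   thresholds: Dict[str, float] = QC_THRESHOLDS) -> List[str]:
    """Return the names of metrics that fall below their minimum.

    A metric missing from run_metrics is reported as failed, so that an
    incomplete QC report cannot pass silently.
    """
    return [
        name for name, minimum in thresholds.items()
        if run_metrics.get(name, float("-inf")) < minimum
    ]


# Hypothetical metrics from one sequencing run.
run = {"pct_bases_q30": 76.4, "mean_target_coverage": 212.0,
       "pct_target_at_30x": 98.1}
failures = failed_metrics(run)
print("investigate before alignment:" if failures else "run passes QC", failures)
```

Storing the per-run values alongside the pass/fail result also supports the periodic trend review described above.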
For the NGS targeted panel performed at the Children's Hospital of Philadelphia, the quality control metrics along with the criteria used are listed in Table 4. Preanalytic, analytic, and postanalytic metrics of the wet bench as well as the bioinformatics pipeline are established, providing criteria for beginning to end of the NGS workflow. Metrics are monitored per sample/run and assessed monthly as part of a continuous quality improvement program.
Before implementation, a validation report must be written and approved by the laboratory director. This report should include an introduction to the test; the diseases/genes being tested; a description of the samples, controls, and methodology used including bioinformatics; validation parameters such as precision, specificity, sensitivity, reportable range, and reference range; and clinical validity and utility of the test. A standard operating procedure is composed that includes test indication, intended use, test principle, specimen handling and storage, reagents and controls, equipment, the stepwise assay procedure, results interpretation and report generation, and references. Integration into the clinical workflow involves training technologists who will perform the test. Training technologists comprises not only technical aspects of running the test, but also disease information to aid in the understanding of a result and its interpretation. Report templates for negative, positive, and uncertain results are drafted; however, customization is often performed and determined by the classification of the observed genetic variants. All equipment that is used should be properly installed, inspected, and maintained continually as long as the test is offered. Procedures for instrument, operation, and performance qualification are available and in place. Quality control and quality assurance measures, including proficiency testing and archiving of records, reports, and tested specimens, should be performed. The billing mechanism and budgetary allocations should also be finalized before the test is operational. Appropriate regulatory agencies may need to be notified (and in some cases may require preapproval) before test implementation. All of these measures need to be ready before offering the test. It is important to keep in mind that validation is a continuous process of monitoring, documentation, and improvement. 
This is especially significant in the continually evolving field of NGS with frequent improvements in technology and informatics tools. Clinical laboratories must therefore carefully balance improvements in test performance with available resources.
The authors would like to thank Mahdi Sarmady, PhD; Kajia Cao, PhD; Laura Conlin, PhD, FACMG; and Hakon Hakonarson, MD, PhD for their support. We thank Patricia Vasalos, BS, and Jaimie Halley, BS, for providing support and coordination for all the next-generation sequencing validation manuscripts in this series; they both are employees of the College of American Pathologists (Northfield, Illinois).
Dr Santani received royalties from Agilent Technologies, was a consultant for Invitae, and has an honorarium from Arcadia University, Cambridge Healthtech Institute. The other authors have no relevant financial interest in the products or companies described in this article.
Supplemental digital content is available for this article at www.archivesofpathology.org in the June 2017 table of contents.
This manuscript is being submitted on behalf of the College of American Pathologists Biochemical and Molecular Genetics Committee and College of American Pathologists Next-Generation Sequencing Project Team.