Detection of variants in hematologic malignancies is increasingly important because of a growing number of variants impacting diagnosis, prognosis, and treatment response, and serving as potential therapeutic targets. The use of next-generation sequencing technologies to detect variants in hematologic malignancies in a clinical diagnostic laboratory setting allows for efficient identification of routinely tested markers in multiple genes simultaneously, as well as the identification of novel and rare variants in other clinically relevant genes.
To apply a systematic approach to evaluate and validate a commercially available next-generation sequencing panel (TruSight Myeloid Sequencing Panel, Illumina, San Diego, California) targeting 54 genes. In this manuscript, we focused on the parameters that were used to evaluate assay performance characteristics.
Analytical validation was performed using samples containing known variants that had been identified previously. Cases were selected from different disease types, with variants in a range of genes. Panel performance characteristics were assessed and genomic regions requiring additional analysis or wet-bench approaches identified.
We validated the performance characteristics of a myeloid next-generation sequencing panel for detection of variants. The TruSight Myeloid Sequencing Panel covers more than 95% of target regions with depth greater than 500×. However, because of unique variant types such as large insertions or deletions or genomic regions of high GC content, variants in CEBPA, FLT3, and CALR required supplementation with non–next-generation sequencing assays or with informatics approaches to address deficiencies in performance. The use of multiple bioinformatics approaches (2 variant callers and informatics scripts) allows for maximizing calling of true positives, while identifying limitations in using either method alone.
Next-generation sequencing (NGS), or massively parallel sequencing, analysis of myeloid malignancies, including acute myeloid leukemias (AMLs),1–5 myelodysplastic syndromes (MDSs),6 myeloproliferative neoplasms (MPNs),1 and MDS/MPNs,7 has yielded a number of significant advances in the identification of diagnostic, prognostic, predictive, and therapeutic biomarkers for these disorders.1,2,8 Genes and variants relevant to tumor progression, tumor evolution in response to therapy, and minimal residual disease assessment in myeloid malignancies have also been identified via whole-genome or whole-exome sequencing, often in concert with whole-transcriptome sequencing.9–11 Recent advances define AML with biallelic CEBPA mutations as a separate disease entity,12 with patients exhibiting this molecular profile having a significantly better prognosis13 compared with patients with wild-type or single-variant CEBPA cases. Likewise, CALR exon 9 mutations are now considered diagnostic of primary myelofibrosis,12 and CALR type 1–like (c.1099_1150del and similar variants) and type 2–like (c.1154_1155insTTGTC and similar) variants are known to have differential prognostic impact.14 The presence of both SF3B1 and JAK2 mutations is diagnostic and prognostic for refractory anemia with ring sideroblasts with thrombocytosis,15 now classified as a specific subtype of MDS/MPN.12 Variants in BRAF, SMC1A, SMC3, RAD21, STAG2, NRAS, IDH1, IDH2, SRSF2, and SETBP1 have also recently been reported to be relevant for disease subclassification or treatment across multiple hematologic malignancies.1,2,16,17 A number of these clinically important genes are known to contain variants with larger insertions/deletions (eg, FLT3, CALR). Other genes are particularly GC rich (CEBPA), and may therefore pose a challenge to amplification during library preparation and subsequent detection using NGS technology.
Several best-practice recommendations for the validation and implementation of NGS approaches have been developed for the clinical diagnostic laboratory setting.18–20 As with all clinical tests, NGS tests need to be assessed for the performance characteristics of analytical sensitivity, specificity, accuracy, reproducibility, linearity, limit of detection, and reportable range.18 Next-generation sequencing testing has additional unique challenges that must be considered, such as genomic regions that are difficult to sequence and that may require complementary non-NGS analytic solutions or bioinformatics solutions to ensure complete analysis. In addition, NGS assays comprise several distinct parts: wet-bench components, bioinformatics approach(es), and clinical interpretation of the variant calls, each of which needs to be considered in designing and conducting NGS assay validation.
In this report, we describe the validation approach for implementation of an NGS diagnostic test for myeloid malignancies in a hospital-based clinical molecular diagnostic setting. We evaluated the performance characteristics and test limitations of the Illumina TruSight Myeloid Sequencing Panel (TSMP; 54 genes, 568 amplicons; Illumina, San Diego, California), testing it against a range of sample types from various hematologic malignancies. We assessed the TSMP for gaps in NGS output and developed complementary approaches to ensure evaluation of all clinically relevant regions. Finally, we assessed bioinformatics and variant assessment approaches to streamline the postanalytic workflow for panel implementation.
Assay Design and Genes/Regions Included
The TSMP is an amplicon-based panel targeting regions of 54 genes with known involvement in various hematologic malignancies, including AML, MDS, and MPNs. These genes include variants that provide clinically relevant diagnostic, prognostic, predictive, or therapeutic information. The panel consists of 568 unique amplicons of ∼250 base pairs (bp) in length, with a total genomic footprint covering ∼141 kb, targeting the complete exonic regions of 15 genes and exonic hot spots of 39 genes.
Validation Samples Used
Samples used in the validation of this test included DNA from peripheral blood leukocytes and bone marrow samples from a number of different hematologic malignancy types, including AML, MDS, MPN, and MPN/MDS. Peripheral blood leukocyte and bone marrow samples were extracted using phenol/chloroform or an automated extraction method where DNA purification was performed using 350 μL of bone marrow or whole blood on the MagAttract DNA Blood Midi M48 Kit and processed on the BioRobot M48 workstation (Qiagen, Hilden, Germany). For analysis of several test performance characteristics, we also used DNA extracted from the reference HapMap samples NA12878 and NA19240. The HapMap cell lines NA12878 and NA19240 have been widely characterized on multiple NGS platforms in various clinical laboratories and are used globally as reference materials for evaluating assays in development. The variant SNP profiles for the HapMap samples NA12878 and NA19240 are known and documented in the Genetic Testing Reference Materials program (http://www.ncbi.nlm.nih.gov/variation/tools/get-rm/; accessed August 19, 2016). This repository houses data obtained by whole-genome sequencing, by whole-exome sequencing, or from targeted tests generated from 12 clinical and research laboratories.
NGS Library Preparation and Sequencing
Library preparation was performed with 50 ng of DNA using the MiSeq Reagent Kit v3 chemistry according to Illumina's standard protocol. Each library preparation includes sample-specific indices, which allow for pooling of libraries prior to sequencing using 2 × 250-bp reads. All validation samples were run on the Illumina MiSeq platform. Run parameters and data output from each run were obtained and compared against specifications outlined by the manufacturer (Illumina). Cluster densities, reads passing filter, and output greater than Q30 met or exceeded the specifications issued by the manufacturer (>50 million reads passing filter per run; cluster density between 1000 and 1500 K/mm2; Q30 > 80%; median data yield per run, 13.9 gigabases).
Data Analysis and Bioinformatics
Data analysis steps were performed using NextGENe v.2.3.1 (SoftGenetics, State College, Pennsylvania), and included read quality trimming, alignment to the reference genome (version hg19), and calling of single-nucleotide variants and short insertions or deletions. Data were viewed on the NextGENe Viewer v2.3.1. Read alignment and variant calling were also performed using the MiSeq Reporter (MSR) v 2.4 (Illumina) software, and annotated using VariantStudio 2.2 (Illumina). Depth of coverage for all samples was obtained from the Genome Analysis Toolkit's21 DepthOfCoverage walker with count_reads setting, which allows overlapping regions to contribute to the count (ie, counts all reads independently, even if from the same fragment); duplicate reads were not marked in the processing of the raw data for amplicon-based panels where reads have the same start and end. Depth-of-coverage analysis was then performed using a custom bioinformatics script. This script parses Genome Analysis Toolkit depth-of-coverage outputs, retrieves coverage values for all the loci, and generates locus-based, interval-based, gene-based, and sample-based graphical distributions of coverage. The script also generates lists of loci, intervals, and genes covered less than 500× in percentiles ranging from 5 to 100 by considering all the samples included in the analysis.
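The parsing step of such a script can be sketched in a few lines. This is a minimal illustration rather than the laboratory's actual script, and it assumes the tab-delimited per-locus layout (Locus and Total_Depth columns) produced by the DepthOfCoverage walker:

```python
import csv
from collections import defaultdict

MIN_DEPTH = 500  # minimum depth threshold used throughout the validation

def low_coverage_loci(per_locus_file, min_depth=MIN_DEPTH):
    """Collect loci whose total depth falls below the threshold.

    Assumes a tab-delimited GATK DepthOfCoverage per-locus table with
    'Locus' (chr:pos) and 'Total_Depth' columns.
    """
    low = defaultdict(list)
    with open(per_locus_file) as fh:
        for row in csv.DictReader(fh, delimiter="\t"):
            depth = int(row["Total_Depth"])
            if depth < min_depth:
                chrom, pos = row["Locus"].split(":")
                low[chrom].append((int(pos), depth))
    return low
```

Per-interval and per-gene summaries follow the same pattern, aggregating these per-locus values before generating the graphical distributions.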
Variants detected in certain genomic regions covering coding regions of genes including IKZF1 and CUX1 were found to be incompletely mapped and annotated on both software systems, NextGENe and VariantStudio. Variants in these genes were manually annotated using alternate sources including Alamut Visual v.2.7.2 software (Interactive Biosoftware, Rouen, France). All samples were further assessed for the presence of the type I22 [c.1099_1150del (p.Leu367fs)] variant in CALR using a laboratory custom-designed CALR detection algorithm, independent of NextGENe and MSR software. Outputs were generated for each sample outlining the number of reads that match to wild-type CALR sequence and the 52-bp deletion variant.
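The read-classification step of such a script can be sketched as follows. The junction sequences here are placeholder parameters (in practice they would be derived from the CALR exon 9 reference sequence), and the function is illustrative only, not the validated laboratory code:

```python
def classify_calr_reads(reads, wt_junction, del_junction):
    """Count reads supporting the wild-type CALR sequence vs the 52-bp
    type 1 deletion, by searching each read for short sequences spanning
    the deletion breakpoint of each allele (placeholder inputs).
    """
    wt = sum(1 for read in reads if wt_junction in read)
    mut = sum(1 for read in reads if del_junction in read)
    total = wt + mut
    return {
        "wild_type": wt,
        "deletion": mut,
        "vaf": mut / total if total else 0.0,  # fraction of informative reads
    }
```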
Criteria for Variant Filtering and Selection for Review
Variant lists were evaluated in regions of coverage greater than 100×, with variant allele fraction greater than 5%. Variants that were present in the reference population data sets (1000 Genomes phase 1 release v3.20101123, 1000 Genomes phase 3 release v5.20130502,23 variants in the ESP6500SI-V2 data set of the exome sequencing project [http://evs.gs.washington.edu/EVS/, accessed August 2016], annotated with SeattleSeqAnnotation137, Exome Aggregation Consortium24 release 0.3, Database of Single Nucleotide Polymorphisms25 build 141 GRCh37.p13) at a global minor allele frequency greater than 1% were filtered out from the analysis. Variants that repeatedly occurred in more than 10% of cases and were not located within a mutation hot-spot region or previously reported in the literature were compiled, recorded as suspected repeating artifacts, and excluded from further analysis. All variant calls that met reporting criteria for depth of coverage and allele fraction were investigated on output files from bioinformatics software packages. Variants that matched all reporting criteria but were detected on only one variant caller were selected for verification by orthogonal methods including Sanger sequencing or restriction fragment length polymorphism and assessed in detail to determine the reason for discordance between callers.
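The filtering cascade described above can be sketched as a single function. The dictionary keys ('depth', 'vaf', 'pop_af', 'key') are illustrative, not the actual annotation field names used in the laboratory pipeline:

```python
def select_for_review(variants, min_depth=100, min_vaf=0.05,
                      max_pop_af=0.01, artifacts=frozenset()):
    """Apply the reporting filters in order: depth and allele-fraction
    thresholds, population minor allele frequency cutoff, and removal
    of previously flagged recurrent artifacts."""
    kept = []
    for v in variants:
        if v["depth"] <= min_depth or v["vaf"] <= min_vaf:
            continue  # below coverage or allele-fraction criteria
        if v.get("pop_af", 0.0) > max_pop_af:
            continue  # common polymorphism in reference populations
        if v["key"] in artifacts:
            continue  # suspected repeating artifact
        kept.append(v)
    return kept
```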
Orthogonal Verification Methods
In order to have a high-throughput means to verify NGS-detected variants, we used a laboratory-developed Sequenom MassARRAY platform (Agena Bioscience, San Diego, California) assay, which analyzes 189 variants in 26 genes (hematologic malignancies panel [HMP]). This panel was previously validated using 32 cases that were known positive by other methods for the NPM1 p.Trp288fs, FLT3-TKD, and JAK2 p.Val617Phe variants, along with a panel of 7 cell lines positive for variants in BRAF, JAK2, KRAS, NRAS, and NPM1, with resultant sensitivity and specificity for the HMP of 100% in comparison with non-NGS methods. Other molecular tests routinely used in the clinical laboratory were used for detection of KIT, NPM1, and FLT3-ITD variants as described previously.26,27 Testing of CALR was performed using DNA extracted from peripheral blood or bone marrow and analyzed for insertions or deletions in exon 9 using fluorescent polymerase chain reaction (PCR) followed by fragment analysis. All samples were also amplified by PCR and sequenced in both directions. Data from other NGS assays (TruSeq Amplicon Cancer Panel; Illumina) were also available for a subset of cases,28 and were used for comparing results from overlapping regions.
Validation Samples Profiled
Data from a total of 139 cases were profiled to assess panel performance, and included data from 72 AML cases (51.8%), 10 MDS cases (7.2%), 26 MPN cases (18.7%), 6 controls (4.3%), and 25 samples (18.0%) from other hematologic malignancies (Figure 1, A). We detected 375 variants in 41 genes for the 139 cases in the validation cohort; variant types detected include single-nucleotide variants, insertions, deletions, and indels (Figure 1, B and C).
Coverage Profiles Across the Panel
The number of samples profiled in a single NGS run is dependent on the capacity of the sequencing instrument. We profiled up to 24 samples using the TSMP per sequencing run on the MiSeq, such that more than 95% of the targeted region achieves greater than 500× coverage in every sample. Data output from 4 TSMP sequencing runs, profiling 92 cases, indicated that average coverage per sample on the TSMP ranged from 2823× to 4801× (median, 3758×). To identify gaps in coverage, genomic regions with coverage of less than 500× in at least 90% of samples were extracted and analyzed. These regions included portions of the coding regions of BCOR, CEBPA, CUX1, GATA2, HRAS, RUNX1, and STAG2. Chromosome coordinates for all identified low-coverage regions were compared against those sites with recurrent mutations (>50 occurrences) reported in the COSMIC database in order to determine if there were mutation hot-spot regions located within the regions of poor coverage. This analysis indicated that the low-coverage regions identified from the panel do not contain known mutation hot spots in any of the genes. Data from 4 TSMP sequencing runs were also compared at the amplicon level using average coverage per amplicon. Data indicated that 94.7% (538 of 568) of amplicons were covered at greater than 500×; only 5.3% (30 of 568) of amplicons had a coverage less than or equal to 500× (Figure 2, A). Of these, 2.1% (12 of 568) of all amplicons failed to reach a coverage depth of 100×. Gene regions included in these low-coverage amplicon sets (<100×) were consistent with those reported above (Figure 2, B).
Use of Orthogonal Methods for Verification
Seventy-one of 139 cases (51.1%) from the TSMP validation sample set were tested on one or more of the following orthogonal platforms: Sanger sequencing, HMP, the Illumina TruSeq Amplicon Cancer Panel, or routinely used and validated single-gene clinical laboratory tests as described in Methods. The data obtained from the different platforms were compared to determine concordance among tests within mutually covered genomic regions. In the 71 samples, variant calls for 162 of 163 mutually covered single-nucleotide variants (99.4%) and insertions/deletions of up to 33 bp were concordant between the TSMP and any one of the other orthogonal testing platforms. One variant was identified as positive for a FLT3 tyrosine kinase domain (TKD) mutation (typically codon 835) by the single-gene FLT3 assay, but was reported by NGS as a 3-bp deletion resulting in an in-frame deletion of codon 836. Secondary verification of this variant by Sanger sequencing to clarify the discrepancy could not be performed because there was insufficient sample.
Test Performance Characteristics
We evaluated a range of test performance characteristics for the TSMP, including assessment of interfering substances, analytical sensitivity, analytical specificity, reproducibility and repeatability, accuracy, linearity, and limit of detection, defined as in Clinical and Laboratory Standards Institute MM09-A2, 2014 edition.47
Analytical Sensitivity and Specificity
Analytical sensitivity is defined as the proportion of biological samples that have a positive test result or known variant and that are correctly classified as positive, that is, the likelihood that the assay will detect a sequence variant, if present; analytical specificity is defined as the ability of a test to detect only the target analytes, that is, the probability that the assay will not detect a sequence variation if none is present. Samples for which variants were present and called within regions that are mutually covered on TSMP and HMP were identified as true positives. A total of 85 true-positive variants from 38 cases were detected mutually between the TSMP and the HMP. There were no false negatives for variants with greater than 5% allele fraction in regions covered at depth greater than 500× (analytical sensitivity = 100%; 95% CI, 95.75%–100%). Thirty-three samples previously tested on the HMP and known to be negative for variants were also tested on the TSMP. All samples (33 of 33) had no reportable variants within the regions covered in both panels (analytical specificity = 100%; 95% CI, 89.42%–100.00%). There were no variants in this data set that met the criteria for false positives.
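The reported confidence intervals can be reproduced with a short calculation. With zero false calls, the exact (Clopper-Pearson) lower bound reduces to the closed form (α/2)^(1/n); this sketch handles only that boundary case (the general case requires a beta distribution quantile, eg, scipy.stats.beta.ppf):

```python
def proportion_with_exact_ci(successes, failures, alpha=0.05):
    """Estimate a proportion (sensitivity or specificity) with an exact
    Clopper-Pearson CI for the boundary case of no failures, where the
    lower bound is (alpha/2)**(1/n) and the upper bound is 1.0."""
    n = successes + failures
    estimate = successes / n
    if failures != 0:
        # General case needs the inverse beta CDF; omitted in this sketch.
        raise NotImplementedError("only the zero-failure case is handled")
    return estimate, (alpha / 2) ** (1 / n), 1.0
```

For 85 true-positive variants with no false negatives this gives a lower bound of approximately 95.75%, and for 33 true-negative samples approximately 89.42%, matching the intervals reported here.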
Test precision is defined as the closeness of agreement between indications or measured quantity values obtained by replicate measurements on the same or similar objects under specific conditions, that is, the degree to which a repeated measurement gives the same result. Test precision includes 2 concepts: repeatability and reproducibility. Reproducibility is the degree to which the same sequence is derived when sequencing is performed by multiple operators, with multiple lots of reagents, by more than one instrument, and, when applicable, from site to site (also known as robustness). Repeatability is the degree to which the same sequence is derived in sequencing the same reference sample many times under the same conditions. Test precision was assessed via analysis of the same samples run multiple times in the same NGS run, on different runs, and between technologists.
Interrun Sample Reproducibility
To test interrun sample reproducibility, allele frequencies for 31 variants from 9 samples profiled in duplicate on different runs were compared. There was 100% concordance in the ability to detect variants between the 2 runs for all samples (Figure 3, A; R2 for correlation between variant allele frequencies = 0.94).
To test intertechnologist reproducibility, selected samples were profiled by 2 different technologists on 2 different runs. Allele frequencies for 23 variants from 6 samples were compared. There was 100% concordance in our ability to detect variants between the 2 runs (Figure 3, B; R2 for correlation between variant allele frequencies = 0.95).
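The concordance metric used in these comparisons (R² between replicate allele fractions) can be computed directly; a minimal implementation:

```python
def r_squared(x, y):
    """Squared Pearson correlation between paired variant allele
    fractions from two replicate measurements."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)
```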
Intrarun Sample Reproducibility
To test intrarun sample reproducibility, 1 sample was profiled 3 times in the same run using 3 different library preparations (index sets 1–3; Figure 3, C). There was 100% concordance in the detection of all 53 variants between the samples.
Test Accuracy, Linearity, and Limit of Detection.—Test accuracy is defined as the closeness of agreement between a measured quantity value and a true quantity value—the degree of agreement between the nucleic acid sequences derived from the assay and a reference sequence. Linearity is the ability of the test to return values directly proportional to the concentration of the analyte in the sample. Limit of detection refers to the lowest amount of analyte that can be detected, that is, the minimum detectable allele fraction in a given sample. To test the accuracy of the TSMP, we used the reference Coriell cell lines NA12878 and NA19240 and assessed concordance between the variants called by our informatics pipeline and those reported by the GetRM project. All publicly reported high-quality variants (true-positive variants identified on 2 distinct technologies) listed in the Genetic Testing Reference Materials project for the sample NA12878 were documented and compared with our list of variants. All high-quality variants reported for NA12878 were detected by the TSMP. All high-quality variants published for NA19240 were also identified on the TSMP (data not shown). There was 100% concordance between the variants detected on the TSMP and those previously reported in studies using the HapMap cell lines.
We also titrated the HapMap cell lines NA12878 and NA19240 against each other in a dilution series of 1×, 2×, 4×, 10×, and 100×. All dilutions were assayed using the TSMP to assess variant profiles, identify the presence of known variants, and establish the limit of detection. Allele frequencies of variants detected in both cell lines demonstrated a linear decline consistent with sample dilution. The TSMP therefore detects variants at frequencies directly proportional to the concentration of the analyte in the sample (Figure 4, A). Similar results were obtained using the cell line NA12878, using a set of 6 known variants (data not shown). The expected and observed allele frequencies for 21 previously described variants uniquely present in cell line NA19240 were identified and compared across 6 dilution levels. All variants (21 of 21) were present and reliably detected at dilutions down to a minimum of 5% variant allele fraction (1:9 dilution). All genomic positions were covered at a minimum depth of coverage of 500× (Figure 4, B). Our data also identified an additional variant that followed the same dilution and variant allele fraction pattern as described above, but that had not been previously reported in this cell line. This variant position was covered at a lower depth of coverage (average, 480×). The relatively lower depth of coverage at this position may explain why it was not included in previously reported data sets.
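The linearity assessment amounts to comparing each observed allele fraction against the proportional expectation; a sketch, under the assumption of heterozygous variants in the diluted cell line (the tolerance value is illustrative):

```python
def expected_vaf(baseline_vaf, dilution_factor):
    """Expected allele fraction after diluting the variant-bearing line
    by the given factor; eg, a 50% heterozygous variant at a 1:9
    dilution (factor 10) is expected at 5%."""
    return baseline_vaf / dilution_factor

def check_linearity(baseline_vaf, observed, tolerance=0.02):
    """Flag, per dilution factor, whether the observed allele fraction
    falls within tolerance of the proportional expectation."""
    return {factor: abs(vaf - expected_vaf(baseline_vaf, factor)) <= tolerance
            for factor, vaf in observed.items()}
```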
Additional Analytical Procedures Required for Clinically Relevant Variants
Identifying Variants in CEBPA
The high GC content (75% in the coding region) of CEBPA poses a challenge to sequencing this gene. A large proportion of the amplicons (5 of 6) covering the clinically actionable CEBPA gene did not meet a minimum depth of coverage of 500× (Figure 5, A and B). We mapped the 6 amplicons covering the CEBPA regions against previously reported variants located in the gene and reported in the COSMIC database (Figure 5, C). Previously reported CEBPA variants occur most frequently within codons 290 to 359 (Figure 5, C); the amplicon covering this region performed well in detecting variants within the range (Figure 5, B). Amplicons with the lowest average depth of coverage were located in the regions between codons 88 and 160 and between codons 218 and 290 (Figure 5, B). We therefore determined that the use of NGS for CEBPA assessment needs to be supplemented with Sanger sequencing, using a modification of previously published primer sets.29
Analysis of Variants in the ASXL1 Homopolymer Region (ASXL1 c.1934dupG; p.Gly646fs)
The ASXL1 c.1934dupG variant occurs in a homopolymer run of 8 G nucleotides, and its clinical relevance is controversial in the literature.30 This variant has been reported to occur as a PCR artifact resulting from polymerase slippage.30 It has also been detected in the normal population (it is reported in the Exome Sequencing Project at frequencies of 2.27%–3.19%, depending on the population, and in the Exome Aggregation Consortium data set at minor allele frequencies of 0%–0.22%), and may be a potential germline variant. There are recent reports of this variant as a somatic gain-of-function variant relevant to disease31; in some cases, groups have verified the validity of the variant by repeated sequencing, at times using different enzymes and primer sets.32,33 Our data demonstrate that this variant occurs at a low variant allele fraction (∼5%) in a large fraction of cases in the TSMP data set, where it is a suspected artifact (Figure 6, A). However, in 30 of 490 cases tested (6.1%), this variant was detected at a higher variant allele fraction (15%–35%; Figure 6, B inset), where it is a true-positive call. Given that ASXL1 is actionable in a number of disease sites, we sought to verify this variant in cases where it appeared at a higher variant allele fraction, in the process of evaluating it for reporting. Unlike variants that occur repeatedly in our data set and are associated with either low variant allele fraction or low depth of coverage, such as STAG2 c.2124A>T (p.Leu708Phe) (Figure 7, A through C), the ASXL1 c.1934dupG variant has a consistently high depth of coverage (median, 5034×; range, 2351×–11 589×) and a bimodal distribution in variant allele fraction (Figure 6, B).
Analysis of Insertion Variants in FLT3 (FLT3-ITDs)
We assessed TSMP's ability to detect clinically actionable FLT3 insertions (found in ∼25% of AMLs), using 31 known positive cases with FLT3-ITD sizes ranging from 24 to 90 bp. Next-generation sequencing detected 15 of 31 positive cases, including 4 cases with FLT3 insertion sizes greater than 25 bp, up to a maximum size of 33 bp (assay sensitivity = 48%; specificity = 100%), necessitating the use of other orthogonal assays to detect larger FLT3 insertions.
Analysis of the Common Type 1 Deletion Variant in CALR c.1099_1150del (p.Leu367fs)
The 52-bp deletion in CALR is a clinically actionable variant that we identified in our data set by using a specific script designed for this purpose, in addition to the variant callers used in our laboratory at the time of assay validation. Read data from a total of 387 samples were directed through the script. Analysis indicated that a total of 12 cases were positive for the CALR deletion variant. Ten cases with available DNA were tested by Sanger sequencing or restriction fragment length polymorphism; all 10 variant-positive cases identified by the script were verified by these orthogonal methods. We further verified that all positive cases identified by the script were also detected by the MSR variant caller after increasing the MSR default insertion/deletion size limit to 55 bp. Analysis of CALR variants in the laboratory is currently performed with the use of the MSR caller alone.
Comparison of Bioinformatics Approaches (NextGENe and MSR—Variant Studio)
Variant output from 2 software systems (NextGENe v.2.3.1 and the MSR variant caller, annotated using VariantStudio 2.2) was compared for analysis of reportable variants selected for review. NextGENe was previously validated for use with the amplicon-based NGS panels in the laboratory,28 and was therefore taken as the standard for comparison against MSR and VariantStudio. Using a set of 183 cases containing a total of 581 reviewed variants, we determined that 97% (567 of 581) of reported variants were detected by both software systems. Eight of 581 variants (1.4%) were not detected by the MSR software; all 8 were insertions/deletions greater than 25 bp in length, and the discordance between software systems was therefore attributed to the use of default instrument/software settings that restricted calling of insertions/deletions to a maximum of 25 bp. Upon review and modification of the MSR indel size settings, all 8 previously undetected variants were identified by this software. Six of 581 variants (1%) were detected solely by the MSR caller, and not by NextGENe. Upon reviewing the alignment data, we determined that all 6 variants were located toward the ends of amplicons, within regions covered by primer/probe sites of overlapping alternate amplicons. Sanger sequencing confirmed that all 6 variants were true-positive calls. Because the NextGENe analysis retained read sequence data located within primer sites for all amplicons, the target coverage in the regions containing these variants was oversampled, leading to lower variant allele frequencies that were then filtered out. In order to streamline analyses, we selected variant calls from a single variant caller for implementation. Because the MSR caller was more effective at capturing variants located at the ends of amplicons, and no variants were missed by the use of MSR alone, it was selected as the single caller for variant reporting.
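The caller comparison reduces to set operations over normalized variant keys; a minimal sketch, keying each call by (chromosome, position, ref, alt):

```python
def compare_callers(calls_a, calls_b):
    """Partition variant calls from two callers into concordant and
    caller-unique sets; concordant calls pass directly, while
    single-caller calls are flagged for orthogonal verification."""
    a, b = set(calls_a), set(calls_b)
    return {"both": a & b, "only_a": a - b, "only_b": b - a}
```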
Variant Nomenclature for Some Genomic Regions May Be Incomplete in Commercially Available Software Tools
Using a combination of VariantStudio and NextGENe analysis methods, we identified genomic regions that were incompletely annotated, including regions within IKZF1 and CUX1 in which detected variants lacked the associated protein amino acid change or complementary DNA change. These regions required manual annotation by overlapping read alignments against known reference sequences for the genes of interest using software such as Alamut. Cartagenia Bench Lab NGS (Agilent, Santa Clara, California) was also able to provide complete annotations for these regions, with the use of the most updated reference sequences.
Methods for Automating Variant Filtering and Interpretation
As part of the validation and implementation process for the panel, we sought to design and evaluate methods for efficient prioritization of variants for interpretation and reporting, including methods to identify and remove polymorphisms in the absence of matched normal samples. We designed a triaging algorithm that could be used for automation to identify potentially clinically relevant variants from tumor-only analysis of NGS data in hematologic malignancies. Variants called using the Illumina MSR software package for each sample were uploaded into a commercially available tool, Cartagenia Bench NGS v4.2, for analysis (Figure 8). Of all variants detected by NGS (median, 427 variants/case; range, 338–643 variants/case), 35% (median, 150 variants/case; range, 125–172 variants/case) passed all MSR quality criteria. Applying a variant allele fraction threshold resulted in a median of 24 variants/case (range, 11–35 variants/case). Reporting was further restricted to well-covered, exonic nonsynonymous, intronic splice site, and known deleterious synonymous variants, resulting in a median of 2 variants/case for manual review (range, 0–10 variants/case; Figure 9, A). When combined with our internal data set of more than 600 unique variant interpretations across 8 hematologic malignancies, this approach enabled the review and interpretation of previously known variants, and allowed for prioritizing novel variants in order of clinical actionability.34 By intersecting the sample variant file with known actionable variants previously interpreted in the laboratory, we were able to prioritize highly actionable variants even at low variant allele frequencies for reporting. Comparison of reportable variant output between the manual review process and the software-based automated review process indicated that all variants identified using the conventional variant review process were detected using the automated process if they were present in the input file. 
This approach recursively uses our laboratory-developed variant knowledge base, and enables us to organize and use variant interpretations easily for generation of clinical and research reports.
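The tiered triage described above can be sketched as a single pass over the called variants. The field names ('filter', 'vaf', 'consequence', 'key') and consequence terms are illustrative rather than the Cartagenia configuration, and the rescue of known actionable variants at low allele fraction mirrors the prioritization step:

```python
REPORTABLE = {"missense", "frameshift", "stop_gained",
              "splice_site", "inframe_indel"}  # illustrative terms

def triage(variants, min_vaf=0.05, actionable=frozenset()):
    """Quality filter, then allele-fraction threshold, then consequence
    filter; variants on the laboratory's known-actionable list are
    retained for review regardless of allele fraction."""
    for_review = []
    for v in variants:
        if v["filter"] != "PASS":
            continue  # failed caller quality criteria
        if v["key"] in actionable:
            for_review.append(v)  # prioritized even at low VAF
            continue
        if v["vaf"] < min_vaf or v["consequence"] not in REPORTABLE:
            continue
        for_review.append(v)
    return for_review
```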
Germline variants are not called from our analysis in the absence of data from matched normal samples, and common polymorphisms were identified using multiple reference population databases. The number of variants identified as polymorphisms using reference databases (National Heart, Lung, and Blood Institute Exome Sequencing Project; Database of Single Nucleotide Polymorphisms; Exome Aggregation Consortium; 1000 Genomes Project, phases 1 and 3) singly and in combination are indicated (Figure 9, B). Data are from a cohort of 30 cases (16 AML, 5 MDS, 3 MPN, and 6 others), and are represented as mean and standard deviation per case.
Considerations About Sample Types in Variant Profile Interpretation
As we anticipated receiving both blood and bone marrow samples for testing on the TSMP, we sought to determine whether sample type impacted sample performance on NGS. DNA extracted from 34 matched cases with both blood and bone marrow as starting material was tested as input for library preparation and sequencing. Both sample types demonstrated equal performance in the assay, generating comparable total numbers of reads, aligned reads, and reads on target. No differences were detected in average depth of coverage or in the fraction of bases with coverage greater than 500× between the 2 sample types (data not shown).
To determine whether sample origin affected mutational abundance and variant profiles, 68 blood- and bone marrow–derived samples from 34 matched patients were run and their variants compared (Figure 10, A and B). Thirteen of 34 cases (38%) had at least 1 variant identified in one sample type but not the other. Ten of 13 cases had a total of 14 variants that were identified in the bone marrow but not in the corresponding blood sample (Figure 10, A and B). Two of 13 cases had variants that were discordant in both the blood and the bone marrow sample; both of these cases had blood and bone marrow samples collected at different times (1.5 and 7 months apart). One case had a variant detected in the blood that was not detected in the bone marrow (Figure 10, A and B). Overall, this indicates that although samples derived from both blood and bone marrow perform equally well in the assay, variant profiles can be unique to each sample type and may differ when sample collection is temporally separated. Tumor cellularity is also expected to vary between the blood and bone marrow compartments, and is likely to affect variant profiles as well.
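The matched-sample concordance analysis above amounts to comparing the variant sets called from each compartment. A minimal sketch, with invented variant identifiers purely for illustration:

```python
# Toy sketch of the blood vs bone marrow variant-profile comparison.
# Variant names are hypothetical examples, not data from the study cohort.
blood = {"JAK2 p.V617F", "TET2 p.Q810*"}
marrow = {"JAK2 p.V617F", "TET2 p.Q810*", "ASXL1 p.G646fs"}

marrow_only = marrow - blood   # variants seen only in bone marrow
blood_only = blood - marrow    # variants seen only in blood
concordant = marrow & blood    # variants detected in both compartments

print(sorted(marrow_only))  # ['ASXL1 p.G646fs']
print(sorted(blood_only))   # []
```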
Confirmatory Testing in Specific Cases
In the bioinformatics analysis and variant assessment workflow, we noted examples of variants where confirmatory testing was required for accurate nomenclature. This was particularly true for indel calls (such as the examples illustrated in Figure 11), where there were 2 alignment possibilities and the call was associated with strand bias. Sanger sequencing was required in these cases to clarify the technical accuracy of the call and its alignment and subsequent nomenclature. Sanger sequencing also proved beneficial for confirming the presence of insertion/deletion calls observed close to amplicon ends (within 5 bp). In the example outlined in Figure 12, a deletion of 2 nucleotides was identified in close proximity to the end of 2 amplicons. Sanger sequencing of this region confirmed the deletion in only 1 of the 2 nucleotide positions. Other complexities include the presence of multiallelic sites (identification of more than 2 variant alleles at the same nucleotide position; Figure 13); Sanger sequencing of this region confirmed the presence of 2 different alleles at the same nucleotide position, in addition to the wild-type allele, and also confirmed the presence of the insertion event that was located downstream of the detected triallelic variant site.
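The amplicon-end rule described above (flagging indel calls within 5 bp of an amplicon boundary for Sanger confirmation) lends itself to a simple positional check. The helper below is a hypothetical sketch; coordinates and the amplicon model are illustrative, not drawn from the TSMP manifest.

```python
# Hypothetical helper flagging indel calls near an amplicon boundary for
# confirmatory Sanger sequencing, per the 5-bp proximity rule noted above.

END_PROXIMITY_BP = 5  # distance from amplicon end that triggers review

def near_amplicon_end(variant_pos, amplicon_start, amplicon_end):
    """True if the call sits within END_PROXIMITY_BP of either boundary."""
    return (variant_pos - amplicon_start < END_PROXIMITY_BP
            or amplicon_end - variant_pos < END_PROXIMITY_BP)

print(near_amplicon_end(10_003, 10_000, 10_250))  # True: needs confirmation
print(near_amplicon_end(10_120, 10_000, 10_250))  # False: interior call
```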
In this report, we present details of applying previously described guidelines18,19 to the validation of the Illumina TSMP in a clinical genome diagnostics setting. We highlight the performance characteristics of the wet-bench analytic test, as well as additional analytic procedures for clinically relevant variants and the use of complementary non–NGS-based tests to address some of the limitations associated with NGS analysis. Furthermore, we discuss the integration of the wet-bench analytical test with bioinformatics, variant assessment, and variant interpretation steps in the overall workflow. We also describe some examples of complex variant analyses and demonstrate the utility of confirmatory or alternate methods in select scenarios.
Others have previously reported on the validation of custom NGS panels for hematologic malignancies.20,35–37 These studies have shown the range of application of NGS in the context of molecular diagnostics for hematologic malignancies; here, we add to this body of work by describing, in detail, key considerations around the integration of wet-bench, data analysis, and variant assessment approaches, all of which are required for successful test implementation. A comprehensive validation approach needs to establish both test performance characteristics and test limitations; this allows for the appropriate application of the test under routine diagnostic conditions. For example, evaluating the ability of TSMP to detect variations in CEBPA enables an assessment of whether a separate test for CEBPA is needed. Because of the lack of appropriate coverage across all relevant exons of this gene, we chose to design a Sanger sequencing assay to complement the NGS analysis. Other groups have identified alternative approaches to improve sequence coverage at the CEBPA locus, such as long-range PCR to capture the entire CEBPA exonic region, followed by Nextera (Illumina) library preparation and inclusion for sequencing as part of the TSMP workflow.38
The TSMP was also not capable of detecting FLT3-ITDs greater than 33 bp in size, because of limitations in the amplicon-based technology that restrict the generation and sequencing of amplicons containing large insertions, as well as limitations of bioinformatic tools in aligning and calling larger insertions/deletions. Efforts are underway to improve the informatics tools used for insertion/deletion detection, extending indel calling up to a size limit of 102 bp.39 Although these methods improve the variant calling rate at the FLT3 locus, they are unable to identify indels larger than the sequenced read pair; in those situations, routine laboratory methods remain the gold standard for detection and reporting of these changes.
Analysis of NGS data also requires an evaluation of repeatedly occurring variant calls and suspected artifacts. We include an example of the ASXL1 c.1934dupG (p.Gly646fs) variant, which appears as an artifact at low variant allele frequencies and as a true somatic variant at higher variant allele frequencies, owing to sequencing artifact arising from the 8-bp homopolymer located at the mutation site. We include an assessment of overall variant allele fraction, depth of coverage, call quality, and the frequency with which the variant appears in the data set to identify potential recurring artifacts and distinguish them from true somatic calls. Because of the high background at this site, the imposed detection limit for this locus was different from that of other regions of the panel, in line with published reports.20
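One way to operationalize a site-specific detection limit at a noisy locus such as the ASXL1 homopolymer is to set the limit above the background observed in control samples. The sketch below assumes a mean-plus-3-standard-deviations rule and invented background values; the study itself set its locus-specific limit empirically, in line with published reports.

```python
# Sketch of distinguishing a recurrent artifact from a true somatic call at
# a noisy homopolymer site: raise the site-specific detection limit above
# the background VAF measured in negative controls. Numbers are invented.
import statistics

def site_detection_limit(background_vafs, n_sd=3):
    """Mean background VAF plus n_sd sample standard deviations."""
    return (statistics.mean(background_vafs)
            + n_sd * statistics.stdev(background_vafs))

controls = [0.021, 0.018, 0.025, 0.022, 0.019]  # background at the site
limit = site_detection_limit(controls)          # ~0.029 for these values

def call_is_reportable(vaf, limit):
    return vaf >= limit

print(call_is_reportable(0.35, limit))   # high VAF: likely true somatic
print(call_is_reportable(0.024, limit))  # within background noise
```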
Our current system of reporting includes verification of selected variant types: variants that occur close to the end of the target amplicons and complex indels in actionable genes should be sequence verified. Any variant that is detected in a homopolymeric region is verified, with close attention to the corresponding negative control sample to ensure that the reported frequency of the variant is adequately above the background signal (noise) detected within the region of interest. Minor disease clones may also be relevant to disease progression and patient management.9–11 The NGS approach is capable of reproducibly identifying variants with a lower limit of detection of 5%; however, Sanger sequencing is not able to reliably verify variants at this level. Although not specifically addressed in this validation, a potential solution would be to incorporate additional orthogonal tests with lower detection limits, such as droplet digital PCR for relevant variants, into the verification workflow.
The density of variants detected by TSMP also necessitated a review of bioinformatics and data analysis practices to ensure repeatability and accuracy of analysis. Automated variant triage and assessment using commercially available tools such as Cartagenia Bench Lab NGS enabled rapid identification of reportable variants. This tool also provided an in silico solution to the absence of a matched normal by using multiple population databases, in combination, to identify and eliminate common polymorphisms.
Finally, we assessed the applicability of our previously published somatic variant classification system34 to variants identified in hematologic malignancies during the assessment and interpretation of clinically reportable variants. In many cases, for genes without mutational hot spots, the presence or absence of a variant in the gene, rather than the variant itself, is relevant to actionability. In the context of our classification system, variants in these genes are identified as class 3, and may be more biologically relevant than class 2 variants (known mutational hot spots in a gene that are known to be actionable in other indications). Furthermore, a larger body of evidence is available for the actionability of genes and variants in hematologic malignancies in the context of diagnosis, prognosis, and/or outcomes in response to treatment, impacting the manner of interpretation and classification of the variant being assessed. Finally, some reports suggest that the actionability of a given variant is dependent upon the molecular profile of that patient; that is, a variant may be clinically actionable only in the presence or absence of other specific changes.1,2 For example, a patient with biallelic CEBPA mutations exhibits an actionable variant profile comprising 2 variants each formally classified as class 3A; either variant in isolation, however, is not actionable.40,41 Several other somatic variant classification systems have been recently reported in the literature.42–46 Although these classification systems are adaptable to the hematologic malignancy context, applying those focused more specifically on drug trials, as well as those focused on variant-level interpretation and classification, poses greater challenges.
In summary, we describe the validation of a commercially available NGS panel with application to hematologic malignancies. We assessed the panel for its performance characteristics—analytic sensitivity and specificity; reproducibility and repeatability; test accuracy, linearity, and limit of detection—as well as for genomic regions that required additional analytic approaches in order to implement successful, comprehensive testing. We also highlighted considerations around bioinformatics analysis and requirements for automation of variant assessment that we undertook in order to streamline the workflow for this panel in the diagnostic setting. In evaluating an NGS panel for validation and implementation, we recommend that diagnostic laboratories consider a comprehensive evaluation of the test against the full scope of its intended application, and implement it alongside appropriate complementary testing to ensure no loss of clinically actionable information.
Funding for this work was provided by the Princess Margaret Cancer Foundation, Genome Canada (Genomic Applications Partnership Program), and the Ontario Genomics Institute. We thank Patricia Vasalos, BS, for providing support and coordination for all the NGS validation manuscripts in this series; she is an employee of the College of American Pathologists (Northfield, Illinois).
The authors have no relevant financial interest in the products or companies described in this article.