Context.—Most current proficiency testing challenges for next-generation sequencing assays are methods-based proficiency testing surveys that use DNA from characterized reference samples to test both the wet-bench and bioinformatics/dry-bench aspects of the tests. Methods-based proficiency testing surveys are limited by the number and types of mutations that either are naturally present or can be introduced into a single DNA sample.

Objective.—To address these limitations by exploring a model of in silico proficiency testing in which sequence data from a single well-characterized specimen are manipulated electronically.

Design.—DNA from the College of American Pathologists reference genome was enriched using the Illumina TruSeq and Life Technologies AmpliSeq panels and sequenced on the MiSeq and Ion Torrent platforms, respectively. The resulting data were mutagenized in silico and 26 variants, including single-nucleotide variants, deletions, and dinucleotide substitutions, were added at variant allele fractions (VAFs) from 10% to 50%. Participating clinical laboratories downloaded these files and analyzed them using their clinical bioinformatics pipelines.

Results.—Laboratories using the AmpliSeq/Ion Torrent and/or the TruSeq/MiSeq participated in the 2 surveys. On average, laboratories identified 24.6 of 26 variants (95%) overall and 21.4 of 22 variants (97%) with VAFs greater than 15%. No false-positive calls were reported. The most frequently missed variants were single-nucleotide variants with VAFs less than 15%. Across both challenges, reported VAF concordance was excellent, with less than 1% median absolute difference between the simulated VAF and mean reported VAF.

Conclusions.—The results indicate that in silico proficiency testing is a feasible approach for methods-based proficiency testing, and demonstrate that the sensitivity and specificity of current next-generation sequencing bioinformatics across clinical laboratories are high.

The complexity of next-generation sequencing (NGS) methods and the range of genetic variants they can detect have created novel quality management issues. The analytic phase of NGS differs most from traditional laboratory assays in that it is divided into 3 separate, operationally distinct components,1  namely (1) sequencing platforms (which, depending on the vendor, require different assay designs to optimize detection of different types of variants); (2) library preparation steps (which represent the so-called wet-bench part of NGS, and are usually structured around target enrichment using either amplification-based or hybrid-capture–based assay designs); and (3) bioinformatics pipelines (the so-called dry-bench part of NGS, which must be optimized for the platform from which the data are generated and for the types of variants an assay is intended to detect).

The fact that quality control and quality assurance activities for NGS must address both the wet-bench and dry-bench components of the analytic phase of testing raises several unique challenges for proficiency testing (PT) and external quality assessment as mandated by the Clinical Laboratory Improvement Amendments of 1988.2,3  Analyte-specific PT programs are the traditional approach to external quality assessment, but the number of genes and range of mutations that are routinely evaluated via NGS-based tests make this approach untenable in routine clinical practice. In contrast, methods-based PT (MBPT) approaches are ideally suited to NGS-based tests.4  Leveraging the concept of MBPT, the College of American Pathologists (CAP) launched, in 2015, the first MBPT specific for NGS-based detection of germline variants, and in 2016, CAP will introduce 2 additional MBPTs for the detection of somatic variants based on genetically engineered specimens that harbor specific somatic “hot-spot” variants observed in solid tumors and hematologic malignancies. These 2 PT surveys are more comprehensive than prior molecular diagnostic PT and challenge both the wet-bench and bioinformatics/dry-bench aspects of NGS tests. However, these PT surveys have practical limitations, including the expense and difficulty involved in generating and characterizing the PT material. Further, it is not possible to generate samples that harbor the full spectrum of variants (ie, single-nucleotide variants [SNVs], indels, copy number variants, and structural variants such as translocations) and range of variant allele fractions (VAFs) that are needed to fully assess germline and somatic NGS assays being used in clinical practice. In addition, the presence of sequence artifacts resulting from the recombinant techniques used to produce many engineered DNA vectors and cell lines5–7  may confound analysis because the artifacts are not present in actual patient specimens.
Given these limitations, additional complementary approaches need to be developed to assess the capabilities of NGS assays, and in this context, we evaluated the concept of in silico PT (ISPT).

ISPT evaluates only the bioinformatics/dry-bench component of NGS assays. In the ISPT approach, sequence data from a well-characterized specimen are manipulated by computerized algorithms to introduce a spectrum of sequence variants. The resulting simulated data files are used as an MBPT to challenge an NGS test's bioinformatics pipeline from alignment through variant detection and annotation. This study was undertaken to demonstrate the feasibility of ISPT using 2 commonly used and commercially available amplification-based NGS molecular oncology targeted gene panels as the experimental model.

Reference Sequence

The CAP reference genome (unpublished data, April 26, 2016) was sequenced using the Illumina TruSeq Amplicon Cancer Panel (Illumina, Inc, San Diego, California) and the Ion Torrent AmpliSeq Cancer Hotspot Panel v2 (ThermoFisher Scientific Inc, Waltham, Massachusetts) on an Illumina MiSeq sequencer and Ion Torrent PGM, respectively. Ion Torrent sequencing used the 316 chip; MiSeq sequencing used 2 × 150-bp reads. The average depth of sequencing was 10 072× and 763× on the Ion Torrent and MiSeq, respectively.

In Silico Mutagenesis

A custom locus walker (MutationMaker v0.3) written in the Java programming language using the Genome Analysis Toolkit (GATK v1.6)8,9  was used to insert SNVs and small indels into the sequence files as outlined in Figure 1. Briefly, reads are first mapped to the hg19 reference10  in a quality-weighted manner using BWA-MEM (Illumina) or TMAP (Ion Torrent); for this alignment hard clipping is disabled, and adapter sequences are retained and soft clipped. The resulting Binary Alignment/Map (BAM) files are sorted and indexed using Picard tools,11  and specific point mutations are then introduced into the BAM files via a BED file containing user-supplied mutations (each specified by chromosome, position, nonreference DNA sequence, and target VAF).
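As an illustration of the mutation specification described above, the following minimal sketch parses one tab-delimited line into a record. The column order (chromosome, start, end, nonreference sequence, target VAF), the 0-based BED-style coordinates, and the names `MutationSpec` and `parse_mutation_line` are assumptions made for illustration; the actual MutationMaker input layout is not described in this report.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MutationSpec:
    """One user-supplied mutation (hypothetical layout, not MutationMaker's)."""
    chrom: str
    start: int   # 0-based start, BED convention (assumed)
    end: int     # exclusive end, BED convention (assumed)
    alt: str     # nonreference DNA sequence to introduce
    vaf: float   # target variant allele fraction, as a fraction of reads


def parse_mutation_line(line: str) -> MutationSpec:
    """Parse one tab-delimited line of the assumed five-column format."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 5:
        raise ValueError(f"expected at least 5 columns, got {len(fields)}")
    chrom, start, end, alt, vaf = fields[:5]
    spec = MutationSpec(chrom, int(start), int(end), alt, float(vaf))
    if spec.end < spec.start:
        raise ValueError("end must not precede start")
    if not 0.0 < spec.vaf <= 1.0:
        raise ValueError("target VAF must be in the interval (0, 1]")
    return spec
```

Validating the VAF and coordinates at parse time keeps a malformed specification from silently producing a file with no variant or an unintended one.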

MutationMaker uses a locus walker with a read-backed pileup to iterate over all target genomic locations and mutate the desired proportion of reads at that position, while preserving the general error structure and sequence base qualities present in the original data file. For a particular position, overlapping reads are chosen at random for mutagenesis, and the desired base(s) are added or deleted; quality scores for the inserted bases are simulated based on the quality of adjacent bases and bases that were removed. For Ion Torrent–generated files, flow space data are altered to be consistent with the introduced mutations; for Illumina sequencing data, the VAFs of inserted mutations are based on unique (nonduplicate) reads.
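The random selection of overlapping reads can be sketched as follows. This is a deliberately simplified model, not the MutationMaker implementation: reads are plain strings, only single-nucleotide substitutions are handled, base qualities and flow space data are not modeled, and the function name `mutate_snv` and its parameters are hypothetical.

```python
import random
from typing import List, Sequence


def mutate_snv(reads: Sequence[str], offsets: Sequence[int], alt_base: str,
               target_vaf: float, seed: int = 0) -> List[str]:
    """Substitute alt_base into a random subset of reads covering one locus.

    reads      -- read sequences that overlap the target position
    offsets    -- for each read, the index of the base at the target position
    target_vaf -- desired fraction of overlapping reads carrying the variant
    """
    if len(reads) != len(offsets):
        raise ValueError("reads and offsets must have the same length")
    rng = random.Random(seed)  # seeded so the simulated file is reproducible
    n_mutated = round(target_vaf * len(reads))
    chosen = set(rng.sample(range(len(reads)), n_mutated))
    mutated = []
    for i, (read, off) in enumerate(zip(reads, offsets)):
        if i in chosen:
            # Replace exactly one base; read length is unchanged for an SNV.
            read = read[:off] + alt_base + read[off + 1:]
        mutated.append(read)
    return mutated
```

Because the number of mutated reads is fixed by the target VAF rather than drawn per read, the realized allele fraction at the locus matches the target up to rounding at the available depth.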

The MutationMaker program outputs randomly ordered, mutagenized FASTQ and/or unaligned BAM file(s) that, with the exception of the inserted mutations, are indistinguishable from the original input files. Data generated by MutationMaker are then remapped, and variants called using the Torrent Suite or the Genome Analysis Toolkit to ensure that the added mutations are detected at the indicated VAFs. Further, the error logs and output files from the read mapping and variant calling steps are checked to ensure there is no evidence that the files have been altered.
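The check that added mutations are detected at the indicated VAFs amounts to comparing an observed allele fraction with its target. A minimal sketch follows; the function names and the 2-percentage-point default tolerance are assumptions for illustration, as no acceptance threshold is stated in this report.

```python
def observed_vaf(alt_reads: int, depth: int) -> float:
    """Fraction of reads at a locus that carry the variant allele."""
    if depth <= 0:
        raise ValueError("depth must be positive")
    return alt_reads / depth


def vaf_matches_target(alt_reads: int, depth: int, target_vaf: float,
                       tolerance: float = 0.02) -> bool:
    """True when the observed VAF is within `tolerance` of the target.

    The default tolerance is a hypothetical value, not one from the study.
    """
    return abs(observed_vaf(alt_reads, depth) - target_vaf) <= tolerance
```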

Design

Two ISPT challenges were designed. In the first (Table 1), 26 variants were introduced, including 24 SNVs (VAFs 10%–50%) and 2 deletions (2 and 15 bp); this challenge was distributed to 3 laboratories using the Ion Torrent sequencer and 2 laboratories using the MiSeq. In the second challenge (Table 2), a total of 26 variants were introduced, including 22 SNVs (VAFs 10%–50%), 1 deletion (of length 18 bp), and 3 dinucleotide substitutions; this challenge was distributed to 4 laboratories using the Ion Torrent and 1 laboratory using the MiSeq. All mutations introduced into the sequence files were modeled from actual somatic mutations reported in the COSMIC database, and the same mutations were introduced into the sequence files from both vendor platforms. The resulting sequence files were distributed to participating laboratories electronically as either paired FASTQ files (Illumina platform) or unaligned BAM files (Ion Torrent platform).

Participating laboratories (all of which are performing clinical NGS of oncology specimens in CAP-accredited, Clinical Laboratory Improvement Amendments of 1988–licensed laboratories) were blinded as to the number, type, location, and VAF of the inserted mutations. The laboratories downloaded the simulated files from the central portal, applied their validated bioinformatics pipeline to align the data and call variants, and reported their results via a standardized form (variants [in g. syntax] and VAFs were reported).

The first challenge was evaluated by 4 laboratories, including 2 running the Ion Torrent sequencer only, 1 running the MiSeq only, and 1 laboratory that used both technologies; a total of 26 variants were introduced into the data files, including 24 SNVs (VAFs 10%–50%) and 2 deletions (2 and 15 bp). On average, 23.2 of 24 SNVs (range, 21–24) and 1.6 of 2 deletions (range, 1–2) were correctly identified (Figure 2, A). The second version of the challenge included 4 Ion Torrent–based laboratories and 1 MiSeq–based laboratory; in this challenge, a total of 26 variants were inserted into the sequence files, including 22 SNVs (VAFs 10%–50%), 1 deletion (size 18 bp), and 3 dinucleotide substitutions. On average, 20.8 of 22 SNVs were correctly identified (laboratory range, 19–22), 0.6 of 1 deletion (laboratory range, 0–1), and 3 of 3 dinucleotide substitutions (Figure 2, B).

Across both challenges, the most commonly missed SNVs were those with low simulated VAFs (10%–15%), accounting for 2 of 4 missed SNVs in the first challenge and 4 of 6 missed SNVs in the second challenge. Many of the remaining missed SNVs with higher VAFs were noted to occur adjacent to single-nucleotide polymorphisms and triggered single-nucleotide polymorphism filters in some laboratories, resulting in false-negative calls. In the first challenge, the 2-bp deletion (simulated VAF = 50%) was missed by 2 of 5 laboratories, whereas the 15-bp deletion was correctly called by all laboratories; it is unclear from laboratory feedback why the smaller deletion was missed. In the second challenge, the 18-bp deletion (simulated VAF = 40%) was detected by 4 of 5 laboratories.
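The reported averages can be converted back to counts of missed calls as a consistency check. The sketch below assumes 5 submitted data sets per challenge (inferred from the per-laboratory denominators given in the text); the function name `missed_calls` is hypothetical.

```python
from typing import Tuple


def missed_calls(mean_detected: float, n_variants: int,
                 n_datasets: int) -> Tuple[int, float]:
    """Return (number of missed calls, overall detection rate).

    mean_detected -- average number of variants detected per data set
    n_variants    -- number of variants of this class in the challenge
    n_datasets    -- number of submitted data sets (assumed to be 5 here)
    """
    total = n_variants * n_datasets
    # round() absorbs floating-point error in mean * n (eg, 23.2 * 5).
    detected = round(mean_detected * n_datasets)
    return total - detected, detected / total


# First challenge SNVs: 23.2 of 24 on average gives 4 missed of 120 calls.
first_snv_missed, first_snv_rate = missed_calls(23.2, 24, 5)
# Second challenge SNVs: 20.8 of 22 on average gives 6 missed of 110 calls.
second_snv_missed, second_snv_rate = missed_calls(20.8, 22, 5)
```

Under that assumption the counts of 4 and 6 missed SNVs agree with the denominators quoted above for low-VAF misses, and 1.6 of 2 deletions in the first challenge corresponds to 2 missed deletion calls.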

Although not all laboratories include VAFs in their clinical reports, we sought to determine the level of reported VAF concordance in this data set. As VAF determination is highly subject to platform biases, we analyzed only data generated on the Ion Torrent platform, which had the greatest number of cases available for comparison. In the first challenge, the median absolute difference between the reported and simulated VAFs across all substitutions was 0.69% (range, 0%–19%). Across deletions, the median absolute difference between the reported and simulated VAFs was 2.1% (range, 0.6%–6.7%; Figure 3, A). In the second challenge, the median absolute difference between the reported and simulated VAFs across all substitutions was 1% (range, 0%–20%). Across all dinucleotide substitutions, the median absolute difference between simulated and reported VAFs was 1% (range, 0%–2%), and for the simulated deletion, the median absolute difference in VAFs was 0.5% (range, 0%–2%; Figure 3, B).
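One way to compute such a concordance statistic is sketched below: for each variant, the VAFs reported by the laboratories are averaged, the absolute difference from the simulated VAF is taken, and the median is reported across variants. Whether the study averaged across laboratories before taking differences in every analysis is an assumption, and the function name and the example values in the usage note are hypothetical.

```python
from statistics import mean, median
from typing import Dict, List


def median_abs_vaf_difference(simulated: Dict[str, float],
                              reported: Dict[str, List[float]]) -> float:
    """Median over variants of |mean reported VAF - simulated VAF|.

    simulated -- variant identifier -> simulated VAF (percent)
    reported  -- variant identifier -> VAFs reported by each laboratory
    Variants with no reported VAF (missed calls) are skipped.
    """
    differences = [
        abs(mean(reported[variant]) - target)
        for variant, target in simulated.items()
        if reported.get(variant)
    ]
    if not differences:
        raise ValueError("no variants with reported VAFs")
    return median(differences)
```

For example, with simulated VAFs of 10%, 25%, and 50% and mean reported VAFs of 10.5%, 24%, and 52%, the absolute differences are 0.5, 1, and 2 percentage points, and the median is 1.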

The Clinical Laboratory Improvement Amendments of 1988 mandate PT for external quality assessment as part of the laboratory accreditation process,2,3  although the precise rules and regulations that govern PT continue to evolve. Many PT programs are based on an individual analyte, and are appropriately termed analyte-specific or disease-specific PT programs. The utility of analyte-specific approaches for DNA analysis has been well documented,12–14  and laboratories that do not perform disease-specific surveys have more errors than laboratories that do.15  However, given the number of genes that are routinely evaluated in clinical practice by NGS-based approaches, and the range of mutations for which testing is performed by NGS, it is virtually impossible for laboratories to follow an analyte-specific PT approach in routine clinical practice. For this reason, MBPT paradigms have been developed that are centered on the method of analysis rather than the specific analyte being tested.4,16  MBPT has some distinct advantages over analyte-specific approaches. MBPT makes it possible to provide comparisons among laboratories for dozens (if not hundreds or thousands) of genes by very complex methods such as NGS, and makes it possible to evaluate proficiency in detection of a wide range of variants. In addition, laboratories that participate in the MBPT challenges are not penalized for the inability to detect a sequence variant that lies in a region outside the scope of their validated test, or types of sequence variants that are not validated within their NGS approach. The MBPT approach has been endorsed by CAP, the American College of Medical Genetics and Genomics, and the Centers for Medicare and Medicaid Services.4,16 

In this context, the fact that there are 3 independent aspects of NGS (sequence platform, wet-bench protocols, and bioinformatics/dry-bench analysis of the sequence reads) complicates surveys designed for PT of NGS assays, whether via an analyte-specific or a methods-based paradigm. The emphasis to date has been on the development of comprehensive PT surveys that evaluate all 3 aspects of an NGS test based on well-characterized genomic DNA samples; these surveys have generally used nucleic acids of 2 types. The first type, synthetic DNA fragments, has particular advantages because it can be designed to incorporate specific sequence variants, at known positions and in known allelic ratios, to simultaneously evaluate many aspects of not only platform performance, but also library preparation and bioinformatics analysis.17  The second type is genetically characterized cell lines; because cell lines are an inexhaustible reagent, and because formalin-fixed, paraffin-embedded cell blocks can easily be produced from cell lines, they are a particularly useful source of reference material for PT application in molecular oncology. It is worth noting that both the Genetic Testing Reference Materials Coordination Program18  of the Centers for Disease Control and Prevention and the National Institute of Standards and Technology19  have developed several well-characterized cell lines for various variants specific to many genetic conditions, and that several commercial vendors and professional organizations (eg, CAP) incorporate cell lines into the reference and/or PT materials they offer for NGS.

However, a recurring theme in clinical NGS testing is that bioinformatics pipelines are not standardized across laboratories. Some clinical laboratories use software supplied by platform manufacturers (which may or may not have been locally modified to improve performance), others use bioinformatics pipelines licensed from software vendors, and others rely on software packages developed in-house. Further complicating matters is the fact that software packages optimized to detect one class of variants in routine clinical use are not necessarily optimized for clinical laboratory use to detect other classes of variants,20–24  and that there are differences between optimized pipelines for constitutional versus somatic analysis.9,25  Traditional analyte-specific and MBPT paradigms for NGS do not comprehensively evaluate bioinformatics pipelines because of the expense and difficulty involved in creating a full spectrum of mutations and range of VAFs in the PT challenge materials. On the other hand, although ISPT comprehensively addresses bioinformatics pipelines,4,26,27  it is limited to this component of NGS testing and thus is an approach to augment traditional analyte-specific and MBPT programs rather than replace them. In this context it is worth mentioning that the ISPT model presented here has a number of clear synergies with the recently launched PrecisionFDA Web portal,28  including the opportunity to use simulated data files with the tools contained in the Web environment to optimize bioinformatics pipelines.

It is important to note that the results we report from our study of a model ISPT have some limitations. Although the results from participating laboratories indicate that the logistics of the approach are straightforward, all the participating laboratories were affiliated with academic medical centers with considerable experience with NGS and in-house expertise in managing the file-sharing protocols among Web sites and bioinformatics tools that are intrinsic to ISPT. This level of experience and expertise may not be widely shared among all clinical NGS laboratories, which may complicate broad implementation of ISPT. Similarly, although the high accuracy of variant identification and VAF estimation by the academic laboratories in this study is reassuring, participation in ISPT by a broader range of clinical NGS laboratories may uncover quality issues that are not apparent in this feasibility study. Another limitation of our study is that it only addressed laboratories performing an amplification-based “hot-spot” assay using a commercial kit (and only 2 commercial kits were modeled). Many clinical NGS laboratories perform amplification-based tests that were developed internally that target different genes and variants, and many clinical NGS laboratories perform hybrid capture–based tests. Clearly, for ISPT to have wide utility, the paradigm must be applicable to a much broader range of NGS assays, and, to address this, we are pursuing a feasibility assessment of a per-laboratory–customized ISPT.

The mutagenesis method presented in this study does not use simulated read data. Instead, actual sequence files from NGS of a well-characterized specimen are manipulated by the MutationMaker algorithm to introduce relevant sequence variants into the sequence files. Our approach retains the heterogeneity of sequence reads that is intrinsic to data from biologic specimens (eg, distribution of quality scores for individual bases within and between individual sequence reads and local sequence contexts; distribution of depth of coverage across a target region). However, the ISPT approach presents its own technical and logistical issues. First, when platform vendors introduce new sequence file types, reference DNA samples must be resequenced to produce the files for in silico mutagenesis. Second, as clinical NGS laboratories increasingly rely on vendor-supplied or licensed bioinformatics pipelines, they may not have the in-house expertise to manage the file-sharing protocols intrinsic to ISPT challenges and may require technical support to participate in ISPT.

In conclusion, the results of our model system indicate that ISPT is a feasible approach to create sequence files containing mixtures of variants that mimic the complexity of clinical samples, and that these simulated sequence files can be used as a type of MBPT to challenge bioinformatics pipelines of amplification-based NGS of oncology specimens. Our results suggest that ISPT is likely to be useful in a broader range of NGS assay designs, including hybrid capture–based tests as well as amplification-based tests. Our results also suggest that ISPT can be used to create simulated files that can be used in MBPT for a broader range of NGS tests, including tests designed to detect germline as well as somatically acquired variants, mitochondrial as well as nuclear variants, and so forth.

First challenge participating laboratories: Julia A. Bridge, MD, University of Nebraska Medical Center, Omaha; Suzanne Kamel-Reid, PhD, University Health Network, Toronto, Canada; Alexander J. Lazar, MD, PhD, and Keyur P. Patel, MD, PhD, University of Texas MD Anderson Cancer Center, Houston; Iris Schrijver, MD, Stanford University, Stanford, California. Second challenge participating laboratories: Anonymous.

1. Gargis AS, Kalman L, Berry MW, et al. Assuring the quality of next-generation sequencing in clinical laboratory practice. Nat Biotechnol. 2012;30(11):1033–1036.
2. Clinical Laboratory Improvement Amendments of 1988, 42 USC §201 (1988).
3. US Department of Health and Human Services: Clinical Laboratory Improvement Amendments of 1988; final rules and notice. 42 CFR §493. Fed Reg. 1992;57:7188–7288.
4. Schrijver I, Aziz N, Jennings LJ, Richards CS, Voelkerding KV, Weck KE. Methods-based proficiency testing in molecular genetic pathology. J Mol Diagn. 2014;16(3):283–287.
5. Lanza AM, Dyess TJ, Alper HS. Using the Cre/lox system for targeted integration into the human genome: loxFAS-loxP pairing and delayed introduction of Cre DNA improve gene swapping efficiency. Biotechnol J. 2012;7(7):898–908.
6. Joung JK, Sander JD. Innovation: TALENs: a widely applicable technology for targeted genome editing. Nat Rev Mol Cell Biol. 2013;14(1):49–55.
7. Urnov FD, Rebar EJ, Holmes MC, Zhang HS, Gregory PD. Genome editing with engineered zinc finger nucleases. Nat Rev Genet. 2010;11(9):636–646.
8. McKenna A, Hanna M, Banks E, et al. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 2010;20(9):1297–1303.
9. DePristo MA, Banks E, Poplin R, et al. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat Genet. 2011;43(5):491–498.
10. University of California Santa Cruz Genome Bioinformatics. Sequence and annotation downloads. April 24, 2016.
11. The Broad Institute. Picard. 2016.
12. Palomaki GE, Richards CE. Assessing the analytic validity of molecular testing for Huntington disease using data from an external proficiency testing survey. Genet Med. 2012;14(1):69–75.
13. Weck KE, Zehnbauer B, Datto M, Schrijver I. Molecular genetic testing for fragile X syndrome: laboratory performance on the College of American Pathologists proficiency surveys (2001–2009). Genet Med. 2012;14(3):306–312.
14. Feldman GL, Schrijver I, Lyon E, Palomaki GE. Results of the College of American Pathology/American College of Medical Genetics and Genomics external proficiency testing from 2006 to 2013 for three conditions prevalent in the Ashkenazi Jewish population. Genet Med. 2014;16(9):695–702.
15. Hudson KL, Murphy JA, Kaufman DJ, Javitt GH, Katsanis SH, Scott J. Oversight of US genetic testing laboratories. Nat Biotechnol. 2006;24(9):1083–1090.
16. Richards CS, Palomaki GE, Lacbawan FL, Lyon E, Feldman GL. Three-year experience of a CAP/ACMG methods-based external proficiency testing program for laboratories offering DNA sequencing for rare inherited disorders. Genet Med. 2014;16(1):25–32.
17. Zook JM, Samarov D, McDaniel J, Sen SK, Salit M. Synthetic spike-in standards improve run-specific systematic error analysis for DNA and RNA sequencing. PLoS One. 2012;7(7):e41356.
18. Centers for Disease Control and Prevention. Clinical Laboratory Improvement Amendments (CLIA): Genetic Testing Reference Materials Coordination Program (GeT-RM): home. 2016.
19. National Institute of Standards and Technology. National Measurement Laboratory: standard reference materials, SRM order request summary. 2016.
20. Mardis ER. The $1,000 genome, the $100,000 analysis? Genome Med. 2010;2(11):84.
21. Pritchard CC, Salipante SJ, Koehler K, et al. Validation and implementation of targeted capture and sequencing for the detection of actionable mutation, copy number variation, and gene rearrangement in clinical cancer specimens. J Mol Diagn. 2014;16(1):56–67.
22. Spencer DH, Tyagi M, Vallania F, et al. Performance of common analysis methods for detecting low-frequency single nucleotide variants in targeted next-generation sequence data. J Mol Diagn. 2014;16(1):75–88.
23. Spencer DH, Abel HJ, Lockwood CM, et al. Detection of FLT3 internal tandem duplication in targeted, short-read-length, next-generation sequencing data. J Mol Diagn. 2013;15(1):81–93.
24. Sharma MK, Phillips J, Agarwal S, et al. Clinical genomicist workstation. AMIA Jt Summits Transl Sci Proc. 2013;2013:156–157. eCollection 2013.
25. Li H, Handsaker B, Wysoker A, et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009;25(16):2078–2079.
26. Kalman LV, Lubin IM, Barker S, et al. Current landscape and new paradigms of proficiency testing and external quality assessment for molecular genetics. Arch Pathol Lab Med. 2013;137(7):983–988.
27. Frampton M, Houlston R. Generation of artificial FASTQ files to evaluate the performance of next generation sequencing pipelines. PLoS One. 2012;7(11):e49110.
28. Food and Drug Administration. PrecisionFDA. https://precision.fda.gov/. Accessed April 24, 2016.

Author notes

Drs Duncavage, Abel, and Pfeifer are cofounders of P&V Licensing LLC. The College of American Pathologists contracts with P&V Licensing LLC for in silico mutagenesis of NGS sequence files. The other authors have no relevant financial interest in the products or companies described in this article.