Context: Next-generation sequencing (NGS)–based assays are used for diagnosis of diverse inherited disorders. Limited data are available pertaining to interlaboratory analytical performance of these assays.
Objective: To report on the College of American Pathologists (CAP) NGS Germline Program, which is methods based, and to explore the evolution in laboratory testing practices.
Design: Results from the NGS Germline Program from 2016–2020 were analyzed for interlaboratory analytical performance. Self-reported laboratory testing practices were also evaluated.
Results: From 2016–2020, a total of 297 laboratories participated in at least 1 program mailing. Of the 289 laboratories that provided information on tests offered, 138 (47.8%) offered only panel testing throughout their enrollment, while 35 (12.1%) offered panels and exome testing, 30 (10.4%) offered only exomes, 9 (3.1%) offered only genomes, and 15 (5.2%) offered panels, exomes, and genomes. The remainder (62 laboratories, 21.4%) changed their test offerings during the 2016–2020 timeframe. Considering each genomic position/interval, the median detection percentage at variant positions across the 2016–2020 mailings ranged from 94.3% to 100%, while at reference positions (no variant detected), the median correct response percentage was 100% across all mailings. When considering performance of individual laboratories, 89.5% (136 of 152) to 98.0% (149 of 152) of laboratories successfully met the detection threshold (≥90% of the variants present), while 94.6% (87 of 92) to 100% (163 of 163) of laboratories met the 95% specificity threshold across mailings.
Conclusions: Since the inception of this program, laboratories have consistently performed well. The median sensitivity and specificity of detection of sequence variants included in this program (eg, single nucleotide variants, insertions, and deletions) were 100.0%.
Next-generation sequencing (NGS)–based tests are of higher complexity than most traditional molecular tests owing to the interrogation of many positions simultaneously, as well as requirements for extensive laboratory wet bench work and computational data analysis (ie, bioinformatics). Currently, most NGS-based tests are performed as laboratory-developed tests and are regulated in the United States by the Clinical Laboratory Improvement Amendments (CLIA) program under the Centers for Medicare & Medicaid Services (CMS). Clinical laboratories that obtain CLIA certification are required to demonstrate the accuracy and reliability of clinical tests they perform. Participation in proficiency testing (PT) that is administered by an external agency is one mechanism by which clinical laboratories can demonstrate their test performance. The College of American Pathologists (CAP) has deemed status from CMS to serve as a provider for PT.
Traditional molecular PT programs for human germline genetic variants have assessed laboratory performance in determining the presence or absence of well-known targeted pathogenic variants in single genes associated with inherited diseases.1–5 The design of these PT programs does not readily extrapolate to sequencing assays that are intended to detect common and rare variants, particularly NGS assays that may include many genes in a single test. Developing PT for NGS is further complicated by the diversity of sequencing platforms, chemistries for sample preparation, bioinformatics data analysis pipelines, and tests being offered by clinical laboratories, which range from limited gene panels to exomes and genomes. In addition, comparison of PT performance across laboratories requires testing of the same sample by all participants for each challenge; however, the requirement for broad consent for use and distribution of such samples to mitigate against the potential risk of loss of confidentiality limits the availability of suitable samples with appropriate variants for an effective PT challenge. Together, these complexities require a novel approach to designing a PT program for NGS-based testing.
The CAP NGS Project Team developed a methods-based PT program for NGS detection of germline variants. This approach allows assessment of the overall analytical process, from sample preparation through sequencing to bioinformatics analysis and germline variant identification, irrespective of the individual laboratory analytical workflow used to perform the test.6 As a strictly technical challenge, this methods-based program can potentially be used by laboratories performing genetic testing for common disorders, as well as rare disorders for which only 1 or a few laboratories may offer testing. As a methods-based approach, variants can be included across the full range of NGS tests offered by every laboratory that subscribes to the NGS PT; however, it is not possible to include clinically significant variants in each gene in each mailing. Combined with the need for a renewable, uniform sample source that can be provided to a large number of laboratories, the CAP NGS Project Team opted to use cell lines well characterized by the Genome in a Bottle Consortium or the Personal Genome Project.7,8 To supplement the limited number of variant types present in these cell lines, and to provide classification and nomenclature challenges that are not amenable to the standardized result form (RF), “dry challenges” have been incorporated into this NGS Germline Program. In addition, both in silico and wet sample–based complementary NGS surveys have been designed for commonly tested indications, such as inherited cancer and cardiomyopathy (ICSP and CMSP, respectively). These include pathogenic variants frequently encountered in these disorders and challenge both variant detection and classification.
The NGS Germline PT Program began as a pilot involving a limited number of laboratories in 2014, followed by a formal PT launch to a broader number of laboratories as an educational program in 2015, and ultimately as a graded PT program in 2016, which has been managed by the Biochemical and Molecular Genetics Committee since 2018. In this report, we describe the NGS Germline Program and summarize results from 2016–2020. The results presented are the first formal evaluation of clinical laboratory NGS-based testing for germline variants under the auspices of the CAP Laboratory Quality Solutions Program.
MATERIALS AND METHODS
This analysis represents a 5-year retrospective summary of practices and performance for the NGS Germline Program. Laboratories enrolled in the NGS Germline Program received 2 mailings (designated A and B, respectively) per year. Data from 2016–2020 were included, for a total of 10 mailings. Each mailing consisted of 10 µg of isolated DNA. Kit instructions directed laboratories to perform their NGS assay (eg, targeted gene panel, exome, or genome) according to their established laboratory procedure and provided the sex of each sample, along with additional details as described below. Although laboratories may, in clinical practice, confirm all or a subset of reported NGS results by Sanger sequencing, the kit instructions indicated that confirmatory testing was optional for this program to reduce the cost of PT participation.
Three different cell lines were surveyed between 2016 and 2020. These cell lines were expanded and sequestered at the Coriell Institute for Medical Research (Camden, New Jersey) for CAP use only. Each cell line was subjected to genome and exome sequencing using different methods. Genome sequencing was performed by using Complete Genomics technology (Illumina, San Diego, California), while exome capture was performed by using Life Technologies TargetSeq (Thermo Fisher, Waltham, Massachusetts), Nimblegen SeqCap (Roche, Basel, Switzerland), or SureSelect (Agilent, Santa Clara, California) target capture kits. Sequencing and variant calling were performed by contracted vendors including ARUP Laboratories (Salt Lake City, Utah), Complete Genomics (San Jose, California), Emory University (Atlanta, Georgia), Illumina (San Diego, California), and Life Technologies (Carlsbad, California). See the Supplemental Methods for details (see supplemental digital content at https://meridian.allenpress.com/aplm in the July 2024 table of contents). The final material provided to the CAP consisted of Genome Reference Consortium human genome build 37 (GRCh37)–aligned BAM (binary alignment map) files, and genome- or exome-wide single nucleotide variant (SNV) and short (<50 bp) insertion, deletion, and delins calls in variant call format (VCF) files. For the 2020 mailings, the genome and exome data generated on the Illumina NGS platform were realigned and variants re-called by using the hs37d5 reference genome (build 37 + decoy)9 and the GATK (Genome Analysis Toolkit) best practices workflow for SNV and short insertion, deletion, and delins calling. This realignment allowed for more accurate mapping of pseudogenes and variant identification at loci with multiple mappings.
A list of variants for potential inclusion was created by identifying variant positions that were concordant between the genome and exome data sets. Additional criteria for position/interval inclusion evolved over the years to include a minimum of 20× sequence coverage of the position or interval, absence of read strand bias, avoidance of homopolymer regions, and limitation to positions within 10 bp of the coding region.
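The inclusion criteria above amount to a simple conjunctive filter over per-position quality metrics. The sketch below is illustrative only; the field names (`depth`, `strand_bias`, and so on) are hypothetical and do not represent the actual CAP pipeline.

```python
# Illustrative sketch of the position-inclusion criteria described above.
# Field names and structure are hypothetical, not the CAP pipeline itself.
from dataclasses import dataclass

@dataclass
class CandidatePosition:
    depth: int                 # sequence coverage at the position/interval
    strand_bias: bool          # True if reads show significant strand bias
    in_homopolymer: bool       # True if the position lies in a homopolymer run
    distance_to_coding: int    # bp from the nearest coding region (0 = within)

def is_includable(pos: CandidatePosition) -> bool:
    """Apply the four stated criteria: >=20x coverage, no read strand bias,
    no homopolymer context, and within 10 bp of the coding region."""
    return (pos.depth >= 20
            and not pos.strand_bias
            and not pos.in_homopolymer
            and pos.distance_to_coding <= 10)

print(is_includable(CandidatePosition(35, False, False, 0)))   # True
print(is_includable(CandidatePosition(15, False, False, 0)))   # False: coverage below 20x
```

Because the criteria are conjunctive, a position failing any single check (eg, adequate depth but located 15 bp into an intron) is excluded.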
Gene and Variant Selection for Proficiency Testing
To operationalize the NGS Germline Program within CAP's existing PT data–handling infrastructure, each mailing was limited to 200 chromosomal positions or intervals. Gene selection was based on expert input by members of the NGS Project Team, coupled with review of gene content from NGS-based germline gene panels that was collected from participating laboratories as part of the program. The genes selected were associated with a range of inherited disorders (eg, cardiomyopathies, hearing loss, germline cancer predisposition syndromes) and varied among mailings. From 2016–2020, of the 424 total genes included, 232 (54.7%) were included in 4 to 6 mailings, 99 (23.3%) in 2 to 3 mailings, 65 (15.3%) in 7 to 9 mailings, 24 (5.7%) in only 1 mailing, and 4 genes (AP3B1, HRAS, KRAS, and RET) in all 10 mailings (see Supplemental Table 1 for a full list of genes included). A total of 200 genes, and respective chromosomal positions or intervals, were selected for each survey to (1) ensure that a sufficient number of genes were available to allow laboratories performing targeted panels for diverse clinical indications to participate, and (2) allow feasibility of administering the survey using the available CAP reporting and scoring infrastructure. Each chromosomal position or interval selected represented either reference or variant allele sequences (SNVs, small insertions and deletions, and delins) selected from the concordant variant data set. Positions were selected by location within coding regions ±10 bases of exon/intron junctions. Each candidate variant was visually reviewed in Integrative Genomics Viewer (IGV) as a quality measure.
For consistency, each candidate variant position was annotated by Alamut Visual (Sophia Genetics, Boston, Massachusetts) using the current National Center for Biotechnology Information (NCBI) reference sequence (RefSeq) transcript listed in the Human Gene Mutation Database (HGMD) for the selected gene.10
Survey Logistics
Laboratories were directed to enter testing results into the RF, which was provided as an electronic PDF populated with the 200 genes and their associated chromosomal positions or intervals. Chromosomal positions or intervals for each gene on the RF were based on the GRCh37/hg19 reference sequence. Laboratories performing multigene panel testing were requested to provide results for only those genes and chromosomal positions or intervals included in their panels. Laboratories performing exome or genome sequencing were required to respond to a CAP-designated set of 50 of the 200 genes per mailing, with the assumption that their individual assays interrogated all of the 50 designated positions. CAP provided a web-based list of all genes included in each mailing so that laboratories performing panel testing could determine if the gene content matched their testing menus before program enrollment decisions. Results were due approximately 80 days after sample mailing.
Program Grading and Evaluations
For the graded portions of the program, which began with the 2016-A mailing, laboratories were asked to determine whether a variant was present at a given genomic position or within a genomic interval, and to indicate variant zygosity. Additionally, laboratories were instructed to describe variants at the cDNA (complementary DNA) and protein levels by applying Human Genome Variation Society (HGVS) consensus nomenclature (http://www.HGVS.org/varnomen) as described in the kit instructions and to report variant descriptions based on the provided NCBI RefSeq transcripts. The nomenclature component was initially ungraded but transitioned to a graded component in the 2020-A mailing. On the program evaluations, the nomenclature component is referred to as the variant description. Laboratories were allowed to submit variant nomenclature based on a transcript other than that indicated on the RF but were instructed to indicate the transcript they used. Variant assessment is currently evaluated from 3 components: variant identification, zygosity assignment, and variant description (Figure 1, A through C). Of note, the variant assessment for the program differs from that used in this study (described in the Data Analysis section below). An acceptable evaluation for a variant in this program requires that the variant identification and zygosity evaluations be acceptable and that the variant description (nomenclature) evaluation be preferred or acceptable. Preferred indicates that the nomenclature is that recommended by HGVS and complies with the recommendations and requirements for reporting spelled out in the kit materials. Acceptable indicates that the nomenclature is technically correct under the HGVS standards but typically is from an older version (eg, p.Cys363Cys instead of p.Cys363= for a synonymous change).
In addition to the variant assessment measure, a specificity approach was also used (Specificity = True Negatives/[True Negatives + False Positives] × 100). To receive an overall evaluation summary for any given survey, at least 5 variant and 5 reference positions needed to be reported (Figure 1, A through C). For an overall “Good” performance grade, laboratories must score 80% or more on the correct variant assessment component and 95% or more on the specificity component. These thresholds were selected owing to manual data entry and the potential for transcription errors in the variant descriptions, coupled with the limited number of positions, both reference and variant, available in the survey, particularly for laboratories using this program for small panels. It should be noted that in routine clinical practice, laboratories establish much more stringent criteria for their tests. If a laboratory’s assay did not interrogate a given chromosomal position or interval, the laboratory was asked to leave all response elements blank. If the assay did interrogate a gene and position or interval but could not provide a definitive evaluation (ie, owing to low-quality sequence data), the laboratory was asked to indicate why it could not evaluate the position/interval. Because the materials used for this program were extensively characterized, grading was based on the intended response.
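The grading arithmetic above can be sketched as follows. The counts in the example are invented, the per-position evaluation logic is simplified, and the label used for a non-passing grade ("Needs improvement") is an illustrative assumption rather than the program's actual wording.

```python
# Minimal sketch of the grading arithmetic described above; counts are
# hypothetical and the per-position evaluation logic is simplified.
def specificity(true_negatives: int, false_positives: int) -> float:
    """Specificity = TN / (TN + FP) x 100, as defined in the program."""
    return true_negatives / (true_negatives + false_positives) * 100

def overall_grade(correct_variant_assessments: int, variant_positions: int,
                  true_negatives: int, false_positives: int) -> str:
    """'Good' requires >=5 reported results on each measure, >=80% correct
    variant assessment, and >=95% specificity. The non-passing labels here
    are placeholders, not the program's actual evaluation wording."""
    reference_positions = true_negatives + false_positives
    if variant_positions < 5 or reference_positions < 5:
        return "Not evaluable"
    variant_pct = correct_variant_assessments / variant_positions * 100
    if variant_pct >= 80 and specificity(true_negatives, false_positives) >= 95:
        return "Good"
    return "Needs improvement"

# Hypothetical lab: 48 of 53 variants assessed correctly (90.6%),
# 139 of 140 reference positions reported as negative (99.3% specificity).
print(overall_grade(48, 53, 139, 1))  # Good
```

Note that the 5-result minimum on each measure is checked before either percentage is computed, mirroring the evaluation workflow in Figure 1.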
Survey grading and evaluation. (A) Evaluation workflow for the next-generation sequencing survey (with references to the tables in [B]), which includes first evaluating the individual chromosomal positions, followed by correct variant assessment (for variant positions) and specificity calculation (for reference positions), and finally combining the 2 measures into an overall evaluation. (B) Tables labeled A through D provide the details used to perform each evaluation component in (A). (C) The “Evaluation Summary” provided to each participant summarizes overall performance. Abbreviations: CAP, College of American Pathologists; SNV, single nucleotide variant.
Data Analysis
For this study, the NGS Program results for 10 mailings from 2016–2020 were analyzed to evaluate both position-specific and laboratory performance. Eighteen chromosomal positions were excluded from the performance measures since these results were not evaluated in the NGS Program mailings. These exclusions were for the following genes: 2016-A: BBS2, BBS4, CYP1B1, FBN1, LAMA2, PMS2, and SGCG; 2016-B: NEFL and PRKDC; 2017-B: RET; 2018-B: COL5A2; 2019-A: RSS1; 2019-B: DMD, DSP, and FBN2; 2020-A: GAA, LTBP4, and PRKAR1A.
For purposes of PT, participants were graded on the combined variant assessment (variant identification; zygosity assignment; and, beginning in 2020, variant description). In contrast, for this study, only the variant identification (detection) status was used to evaluate laboratory performance of variant assessment because the variant description was not graded for the 2016–2019 mailings. The correct detection status for each position was based on the position status: variant or reference. For reference positions, a correct response was no variant detected. The individual chromosomal position detection statuses were used to create laboratory-specific sensitivity (ie, a variant is present and the laboratory detected it) and specificity (ie, for reference positions, the laboratory reported “no variant detected”) measures. Each measure required that the laboratory provide at least 5 results. Compliance with overall thresholds of 90% for sensitivity (more stringent than the 80% used for survey grading of variant assessment, to account for the exclusion of zygosity and variant description) and 95% for specificity was calculated for each mailing. Throughout the article, detection status (eg, detection rate for variant positions or correct response rate for reference positions) refers to the aggregate performance of all laboratories at a specific position, while sensitivity and specificity refer to individual laboratory performance across variants.
Additional testing characteristics for the participating cohort were summarized. Platform and assay type were based on the laboratory self-reported responses collected in each mailing. The number of positions (or intervals) tested by each laboratory per mailing was calculated from the reported position results, and the institution type summary was based on the classification extracted from CAP’s demographic database. Missing classifications were obtained from web search results of each institution’s organizational profile page. Fisher exact tests were used to evaluate mailing-specific sensitivity and specificity compliance rate differences between US and international laboratories. A significance level of .05 was used for this testing.
All analyses were performed with SAS 9.4 (SAS Institute, Cary, North Carolina).
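Although the analyses here were performed in SAS, the Fisher exact test applied to a 2 × 2 compliance table (eg, compliant vs noncompliant by US vs international) can be illustrated with a self-contained sketch built directly on the hypergeometric distribution. The laboratory counts in the example are invented for illustration; they are not the article's data.

```python
# Self-contained two-sided Fisher exact test on a 2x2 table, computed from
# the hypergeometric distribution (standard library only).
from math import comb

def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """P value for the 2x2 table [[a, b], [c, d]] with fixed margins."""
    r1, r2, c1, n = a + b, c + d, a + c, a + b + c + d
    def prob(x: int) -> float:
        # Hypergeometric probability of a table with top-left cell x.
        return comb(r1, x) * comb(r2, c1 - x) / comb(n, c1)
    p_obs = prob(a)
    lo, hi = max(0, c1 - r2), min(r1, c1)
    # Sum probabilities of all tables at least as extreme as the observed one
    # (small tolerance guards against floating-point ties).
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))

# Invented example: 90 of 100 US labs vs 55 of 60 international labs compliant.
p = fisher_exact_two_sided(90, 10, 55, 5)
print(p > 0.05)  # True: not significant at the .05 level used in the article
```

The test conditions on both margins of the table, which makes it appropriate for the small per-mailing cell counts seen in this program (eg, only a handful of noncompliant laboratories per mailing).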
RESULTS
Participating Laboratories
Two hundred ninety-seven laboratories submitted results for at least one of the mailings (Figure 2). A little more than half of the participating laboratories (168; 56.6%) were in the United States. Twenty-two laboratories (7.4%) were in Canada, and 20 (6.7%) were in China. The remaining 87 laboratories were from 29 different countries. In terms of practice setting, 208 of the 297 participating laboratories provided information, while classifications for the remaining laboratories were obtained from internet search results of each institution’s organizational profile page (Figure 2). Of all 297 laboratories, more than half (176; 59.3%) were classified as independent/commercial reference laboratories, and nearly another third (93; 31.3%) were classified as academic hospital/medical center laboratories. The remaining practice settings were nonacademic hospital/medical center laboratories (23; 7.7%) and non-hospital/clinic laboratories (5; 1.7%).
Institution and practice setting. Laboratory practice setting is provided and categorized on the basis of whether the laboratory is based in the United States (domestic, blue bars) or is international (gray bars). The percentage of participating laboratories in each category is provided to the right of the bar.
Based on responses in the most recent mailing in which each laboratory participated through 2020, nearly half of the 289 responding laboratories (144; 49.8%) offered gene panels only, while others offered exomes only, genomes only, or various combinations of the 3 (Table 1). A variety of platforms were used by the 291 laboratories reporting this information, with the Illumina MiSeq being the most common (128; 44.0%). Based on responses to questions in the first and most recent surveys in which each laboratory participated, the types of assays individual laboratories offered changed over time (Table 2). Among the 153 laboratories that initially offered only panels, 10 expanded to also offer exomes, 3 to also offer exomes and genomes, and 2 transitioned from only panels to only exomes. Most laboratories did not change the types of assays that they offered during this period (Table 2).
Survey Content and Performance
A total of 424 genes were included across the 10 mailings. A list of the individual genes included, along with the number of mailings that included each gene and the number of participants testing it, is shown in Supplemental Table 1. Each mailing included 200 chromosomal positions or intervals that laboratories could evaluate. Each position or interval could represent reference sequence, an SNV, a deletion, a duplication, or an insertion. The breakdown for each mailing is visualized in Figure 3. In each mailing, most positions were either reference or SNV, along with at least 1 and up to 4 deletions, duplications, and/or insertions. Of note, until 2019, duplications and insertions were categorized together as insertions.
Frequency of chromosomal position types included in NGS Program mailings. The number of reference positions (black circles) or variant positions (gray circles) in each mailing, as indicated on the vertical axis, is provided. Variant positions are categorized as SNV, deletion, duplication, or insertion. Abbreviations: NGS, next-generation sequencing; SNV, single nucleotide variant.
The performance of the individual positions by position type is summarized in Table 3. In the 2016-A mailing, the range of detection rates for the 53 positions with variants was 83.6% to 100%. For the last mailing included in this study (2020-B), the range of detection rates for the 97 positions with variants was narrower, at 92.3% to 100%. Six of the 10 mailings had variants with detection rates less than 90% (Table 4). Performance for the individual reference positions was better: there were no reference positions with correct response rates less than 90%. Reference position performance was most variable in the first (2016-A) mailing, where the correct response (true negative) rates for the 140 reference positions ranged from 90.5% to 100%.
The positions with the lowest detection rates (<90%) are shown in Table 4. Of note, 6 of the 13 positions with low detection rates were in the first mailing (2016-A) when this program was new to laboratories, which may reflect unfamiliarity with the mechanics of the program, including data review and reporting, rather than lack of detection by some participants that missed these variants. One variant in DSP with a detection rate less than 90% was included in 2 consecutive mailings. Several other positions (TPM2, EYS, AP3B1, and ANO5) with a low detection rate were intronic variants in homopolymer regions. For example, the ANO5 variant (c.364-8del; also known as c.364-8delT in previous HGVS versions) was in a homopolymer region, which includes a stretch of 8 T nucleotides from −15 to −8. Given that most bioinformatic pipelines shift this variant to the −15 position, and the variant was in a homopolymer region, it is not known whether laboratories did not detect the variant or considered it outside of their reporting range (often ±10 bp from the exon boundaries). For a detailed description of the types of variants that were challenging to laboratories and potential solutions to improve performance, see Nardi et al.13
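The ambiguity underlying the ANO5 example can be demonstrated directly: deleting any single T from a homopolymer run yields an identical sequence, which is why variant callers must choose a normalized representation (VCF tooling conventionally left-aligns indels, while HGVS prescribes the 3′-most description). The sequence below is a hypothetical stand-in for the ANO5 intronic context, not the actual reference sequence.

```python
# Demonstrates why a 1-bp deletion inside a homopolymer is ambiguous:
# removing any one T from a run of 8 Ts yields an identical sequence.
# The flanking bases are invented for illustration; the 8-T run stands in
# for the stretch at roughly the -15 to -8 intronic positions in ANO5.
intron_tail = "CAG" + "T" * 8 + "ACATAG"   # hypothetical reference sequence

def delete_base(seq: str, i: int) -> str:
    """Return seq with the base at index i removed."""
    return seq[:i] + seq[i + 1:]

# Deleting the T at any index within the homopolymer gives the same result,
# so a caller must normalize the call to a single canonical position.
results = {delete_base(intron_tail, i) for i in range(3, 11)}  # the 8 T indices
print(len(results))  # 1: all eight possible deletions are indistinguishable
```

This is why, as noted above, it is impossible to tell from the submitted results whether a laboratory failed to detect such a variant or normalized it to a position outside its reporting range.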
The laboratory-specific performance measures are summarized in Table 5. Since the initial mailing, the number of participating laboratories graded for sensitivity assessment gradually increased, from 70 in 2016-A to 149 in 2020-B, peaking at 166 in 2019-B. When evaluating the sensitivity performance of laboratories using the 90% cutoff applied in this article, the percentage of laboratories meeting this threshold (ie, successfully detecting ≥90% of the variants present) ranged from 89.5% to 98.0%. The number of participating laboratories graded for specificity assessment also gradually increased over the years, from 92 in 2016-A to 149 in 2020-B, peaking at 163 in 2019-B. The cutoff of 95% was used to assess laboratory performance on specificity for both this article and the program itself, and 94.6% to 100% of laboratories met this threshold (ie, successfully reported that no variant was detected at ≥95% of the reference positions). No statistically significant differences in sensitivity or specificity were identified between US-based and international laboratories across the mailings.
The total number of laboratories participating in the NGS Germline PT in each of the 10 mailings and an overview of the number of positions tested are shown in Table 6. The number of laboratories participating in this PT program during the interval included in this analysis ranged from a minimum of 116 in the initial 2016-A mailing to a peak number of 171 laboratories in the 2018-B and 2019-B mailings. The median and 10th percentile of the number of positions tested also increased from 31 to 50 positions and 3 to 13 positions, respectively. Except for the first mailing, the 90th percentile of the number of positions tested has been in a tight range (197–200).
The 2020-A and 2020-B mailings were the first to include educational “dry” challenges that were developed to supplement the graded analytical part of this program. For both mailings, the challenge consisted of an IGV screenshot, with a question or questions related to interpreting the result, and multiple-choice responses (Supplemental Figures 1 and 2). For the 2020-A mailing, an IGV screenshot of an intronic canonical splice variant at a +1 position was given, and participants were asked to select the correct HGVS nomenclature and classify the pathogenicity of the variant. Participants performed well on this challenge, with 98.1% (153 of 156) selecting the correct nomenclature (c.993+1delG) and 100% correctly classifying the variant as pathogenic. For the 2020-B mailing, an IGV screenshot was provided in which a single delins was represented as 2 variants by the variant calling pipeline. For this challenge, 83.7% (118 of 141) of participants recognized that there was only a single variant, and 80.8% (114) selected the correct HGVS nomenclature (c.624delinsGGGGGGA, p.Asp208delinsGluGlyGly).
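The underlying task in the 2020-B challenge, recognizing that 2 adjacent variant calls represent a single mutational event, can be sketched as a merge of abutting (position, ref, alt) records. The records and coordinates below are invented for illustration and do not reflect the pipelines used by participants.

```python
# Sketch of merging adjacent variant calls into a single delins-style record,
# the situation posed in the 2020-B dry challenge. Records are hypothetical.
def merge_adjacent(v1, v2):
    """Merge two (pos, ref, alt) calls that abut on the reference into one
    combined record; assumes v1 immediately precedes v2."""
    pos1, ref1, alt1 = v1
    pos2, ref2, alt2 = v2
    assert pos1 + len(ref1) == pos2, "calls must be adjacent on the reference"
    return (pos1, ref1 + ref2, alt1 + alt2)

# Hypothetical pair: two separately reported calls that arose from one event.
merged = merge_adjacent((101, "AC", "G"), (103, "T", "GA"))
print(merged)  # (101, 'ACT', 'GGA'): one delins replacing ACT with GGA
```

A merged record like this maps naturally onto a single HGVS delins description, whereas reporting the two calls separately, as some pipelines do, produces two descriptions for one event.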
DISCUSSION
In this report, we describe the evolution of the first CAP NGS Germline PT Program, which is methods based, and the performance on this program for laboratories analyzing germline variants by NGS. This program was intended to provide an external technical assessment of the detection of SNV and small insertions, duplications, deletions, and delins for laboratories performing gene panels, exome sequencing, or genome sequencing for a variety of genes associated with inherited diseases. For each mailing, 200 genes and chromosomal positions or intervals were included to provide a large enough set of genes to adequately interrogate the diverse set of panels offered by laboratories, while remaining manageable to process and review within the CAP PT informatics infrastructure.
Survey Performance
The results indicate that most laboratories correctly identified variants when present. In each mailing, only a minority of SNV positions were not detected by responding laboratories. Further analysis of these positions, as described by Nardi et al,13 provided insights into genomic regions that were more difficult for laboratories to evaluate, which included regions with high GC content, homopolymer regions, and regions with pseudogene interference.
Grading of nomenclature responses in this program began in 2020. Unacceptable nomenclature responses were due to a variety of factors, including inaccuracies in annotation software output, incorrect HGVS nomenclature guideline interpretation by participants, failure to designate an alternative transcript from the one listed on the RF, or transcription errors. An in-depth discussion with specific examples of the most common annotation difficulties encountered in both germline and somatic CAP NGS PT surveys has been published previously.13
To assess performance of additional aspects of NGS testing that are not amenable to the RF that is used to collect responses for identification of variants and nomenclature, dry challenges were introduced into this program beginning in 2020. The 2 dry challenges incorporated in the timeframe studied here indicate that participants can recognize and correctly apply HGVS nomenclature to SNVs visualized in IGV, but resolving the structure and nomenclature of more complex variants, such as deletions and insertions, poses greater difficulty. This difficulty has been noted previously.13
Survey Limitations
The primary purpose of this program has been to assess analytic NGS performance for detection of sequence variants rather than their interpretation. Another drawback is the use of normal cell lines, which limits the diversity and complexity of variants that can be interrogated. Most of the variants in the available cell lines are benign, and participants may have had to adjust their analyses to identify them, because such variants are frequently filtered out by bioinformatics pipelines. Alternative approaches, such as manual inspection of the BAM file in a genome browser or adjustment of bioinformatics filters, may have been required. While visualization of sequence results may be necessary to differentiate artifacts from real variants or to verify the correct annotation of a variant, the large number of benign variants in this program and the associated manual analysis may have resulted in errors that otherwise would not occur. Another limitation arising from the use of normal cell lines is the predominance of reference positions and single nucleotide substitutions relative to single nucleotide deletions, duplications, insertions, and more complex variants. In addition, this program specifically avoided variants located in complex regions of the genome, such as regions of homology and repetitive regions. This limitation reflects the challenges of short-read NGS chemistry. Most laboratories that include these complex regions in their tests also perform confirmatory testing, particularly in the setting of homology and pseudogenes; this was considered out of scope for this NGS methods-based PT program, as other methodologies are typically required.
To address complex variants, variants located in complex regions of the genome, and variant classification/interpretation, the Biochemical and Molecular Genetics Committee has added dry challenges to this program, as mentioned above. In addition, CAP offers and recommends separate CAP NGS PT surveys for common germline disorders, such as inherited predisposition to cancer (ICSP) and cardiomyopathies (CMSP), which include relevant pathogenic variants. These additional programs also evaluate the ability of laboratories to correctly classify the variants detected. Furthermore, as laboratories transition from targeted genotyping for specific pathogenic variants to NGS, they frequently continue to subscribe to the CAP Molecular Genetics Series programs (MGL modules) for inherited disorders, which were originally designed for targeted genotyping but are compatible with NGS. These programs include known pathogenic variants and require participants to interpret the clinical significance of their results. Finally, in silico PT programs have recently been developed for undiagnosed disorders (NGSE for probands and NGSET for trio analysis) and require variant classification and clinical interpretation.
At the time this program was launched, many laboratories were using non-NGS techniques for copy number variation (CNV) detection; therefore, CNVs were considered out of scope and were not included. Other types of structural variation detection have also not been addressed by this program, which was developed for laboratories offering small panels in addition to exome or genome sequencing. As more participants move toward genome sequencing, PT that challenges the detection of structural variation will need to be developed.
Future Directions
Analysis of the NGS Germline Program performance over several years demonstrated that the number of required positions for laboratories performing exome or genome sequencing could be reduced from 50 to 25 without affecting grading assessment. Therefore, this change was implemented in 2021 to decrease the frequency of transcription errors and the time involved in completing each mailing. The option to report results based on genome assembly GRCh38/hg38 was introduced with the 2021-A mailing. While transitioning the entire program to hg38 is under consideration, recent data obtained from questions included in the NGS 2022-A mailing indicated that among the 177 laboratories that responded, 79.7% are currently using GRCh37/hg19 as their reference genome, while 20.3% have transitioned to hg38.14 The percentage of laboratories using hg38 was lower among those participating in the 2022-A Next-Generation Sequencing–Hematologic Malignancies (NGSHM) program (n = 170; 5.9% using hg38 and 1.8% using both hg19 and hg38 within their laboratories; for this program, however, the RF is set up only for hg19) and the Next-Generation Sequencing–Solid Tumor (NGSST) program (n = 271; 6.6% using hg38 and 3.0% using a combination of hg19 and hg38).15,16 Of note, in recent NGS Germline mailings, an issue was identified with several genes that are falsely duplicated in hg38, which leads to false-negative calls for KCNE1.17 Further improvements (patches) to correct issues and/or new versions of the reference sequence, such as the assembly created by the Telomere-to-Telomere Consortium, may become available.18 As such, laboratories will need to document the reference sequence used in their processes, and PT may need to remain compatible with multiple reference genomes for the foreseeable future. Finally, the current NGS Germline Program does not include CNV or other structural variant detection, which is currently assessed by other PT programs.
As there are multiple methodologies in addition to NGS—such as chromosomal microarrays and optical genome mapping—that are available, development of a platform-agnostic PT assessment of detection of these types of variants is under consideration.
In conclusion, this report describes the development and implementation of the first CAP PT program specifically designed for NGS-based detection of germline sequence variants. This program represents an example of applying the concept of methods-based PT and is applicable to laboratories performing panels, exomes, and genomes.6 Nearly 50% of laboratories that subscribe to this program offer targeted panels, underscoring the need for a flexible solution that works across laboratories that include differing subsets of the genome in their tests. A high degree of acceptable laboratory performance for the detection of germline variants was earlier reported in the CAP/ACMG PT Program, which is methods based and relies on interpretation of Sanger sequencing data.19 Overall, the high sensitivity and specificity of participant results for the NGS Germline Program indicate that laboratories have successfully adopted NGS technology and perform well on this PT program, which was designed to evaluate the ability of laboratories to correctly identify the presence or absence of germline sequence variants.
Author notes
Supplemental digital content is available for this article at https://meridian.allenpress.com/aplm in the July 2024 table of contents.
Tsuchiya is currently located at the Institute for Genomic Medicine, Nationwide Children’s Hospital, in Columbus, Ohio. Halley is currently located in the Learning Department at the College of American Pathologists, Northfield, Illinois. Zhao is currently located at Nutrien in Loveland, Colorado.
The authors have no relevant financial interest in the products or companies described in this article.