Proficiency testing (PT) plays a crucial role in maintaining the quality of clinical laboratories, as mandated by the Clinical Laboratory Improvement Amendments (CLIA).1 Central to PT is the use of well-defined reference materials that closely mimic clinical samples.2 The College of American Pathologists (CAP) Molecular Oncology Committee (Mol Onc) oversees PT programs for next-generation sequencing (NGS) assays; the reference materials for these programs include genetically engineered cell lines and in silico mutagenized sequencing data files. These materials are designed to resemble clinical specimens encountered in routine practice by containing various genetic variants found in clinically relevant genes with a range of variant allelic frequencies (VAFs).

Also see p. 139.

In 2016, Tapestry Networks initiated the Sustainable Predictive Oncology Therapeutics and Diagnostics (SPOT/Dx) working group, bringing together diverse stakeholders, including clinical and policy experts, regulators, payers, patient advocates, and industry leaders. SPOT/Dx aimed to standardize the development of molecular oncology testing and quality assurance programs. To enhance the standardization of lab-developed NGS testing (NGS LDT), SPOT/Dx conducted a Diagnostic Quality Assurance pilot study using paraffin blocks containing human cell lines and in silico mutagenized files as reference materials for RAS testing in colon cancer, a strategy similar to that of CAP PT programs. This study, published in 2022, revealed limitations in NGS LDTs, including discordant findings between the results reported by participating laboratories and the results of US Food and Drug Administration (FDA)–approved companion diagnostic (CDx) tests.3 A major source of discordant results was subsequently attributed to the inclusion in the SPOT/Dx pilot study of variants whose mean detected VAF was below the limit of detection (LOD) of many participants’ NGS LDTs.4 As a result, some participating laboratories did not report these variants, leading to an artifactually increased false-negative rate.

Using variants at LOD for NGS PT is a crucial aspect of ensuring the accuracy and reliability of NGS assays because it allows participating laboratories to assess the sensitivity of their NGS LDT by challenging them to identify variants present at low allelic fractions. Because clinical samples are frequently small and contain variants at low allelic frequencies, testing against PT samples with low VAF helps laboratories identify potential issues with sensitivity, assay performance, and data analysis, allowing for the improvement of processes and procedures. It also allows laboratories to benchmark their performance against others in the field, facilitating meaningful comparisons and promoting standardization.

Because a major goal of the SPOT/Dx pilot study was to determine the performance of NGS LDTs, it was designed to include challenging variants with low VAF, similar to CAP PT programs.5  However, the design of the SPOT/Dx pilot study differed from conventional PT programs in several ways, and as a result its conclusions represent an underestimation of the actual performance of the participating laboratories.4  For example, the reference materials in the SPOT/Dx pilot were designed with VAFs hovering near the LOD of NGS LDTs used by most of the participating laboratories. Although designing a PT program with VAFs near the LOD of NGS LDT provides a good opportunity for evaluating the performance of an NGS LDT, this approach creates a problem when the LOD of each NGS LDT is not taken into consideration. In this scenario, the reported variant was graded incorrectly when the VAF was not equal to or greater than the VAF listed in the package insert of the FDA-approved CDx comparative assay. Consequently, the performance of NGS LDTs was underestimated, leading to erroneous impressions regarding the quality and accuracy of these LDTs. Furthermore, labs may have used subjective criteria for reporting variants with VAF below their NGS LDT’s LOD. This could have easily led to reporting variability in the SPOT/Dx pilot study and further contributed to the underestimation of the true performance of NGS LDTs.
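The effect of ignoring each laboratory's LOD during grading can be illustrated with a minimal Python sketch. This is not the grading method used by SPOT/Dx or by CAP; the `Challenge` class, the `grade` function, the `not_graded` outcome, and the VAF and LOD values are all hypothetical and chosen only to show how the same unreported low-VAF variant could be scored differently under the two approaches.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Challenge:
    """One PT challenge: a variant engineered into the reference material."""
    variant: str          # hypothetical label, e.g. "variant_A"
    expected_vaf: float   # engineered VAF as a fraction (0.05 = 5%)


def grade(challenge: Challenge, reported: bool, lab_lod: Optional[float] = None) -> str:
    """Grade one laboratory response (hypothetical scheme, for illustration).

    lab_lod is the laboratory's validated LOD as a VAF fraction.
    When it is None, grading ignores the LOD and every miss counts
    against the laboratory.
    """
    if reported:
        return "correct"
    if lab_lod is not None and challenge.expected_vaf < lab_lod:
        # The variant sits below what the assay is validated to detect,
        # so a non-report is excluded from grading rather than penalized.
        return "not_graded"
    return "incorrect"


# Illustrative values only; not taken from any PT program.
low_vaf = Challenge(variant="variant_A", expected_vaf=0.04)
lod_blind = grade(low_vaf, reported=False)
lod_aware = grade(low_vaf, reported=False, lab_lod=0.05)
```

In this sketch, the same unreported variant counts as a miss when the LOD is ignored and is excluded from grading when the laboratory's validated LOD is taken into account, which is the kind of difference that can depress an apparent detection rate.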

Another issue in the SPOT/Dx pilot study was the inclusion of 4 engineered multinucleotide variants (MNVs), which are rarely encountered in colon cancer. Detection of insertions and deletions, especially dinucleotide or trinucleotide sequence substitutions (ie, MNVs), is essential for patient management but challenging for bioinformatics pipelines. MNVs are prone to being annotated incorrectly because each nucleotide substitution within the MNV can be interpreted individually rather than as a collective whole.6 For the 4 MNVs included in the SPOT/Dx pilot, a detection rate of 81.1% for the 159 participating laboratories was reported.3 Interestingly, the detection rate of these same MNVs has improved significantly over time in CAP NGS PT programs.4 This indicates that CAP NGS PT programs have achieved one of their major stated goals: contributing to continuous improvements in laboratory performance through the implementation of enhanced bioinformatics pipelines or practices, such as manual review of BAM files using the Integrative Genomics Viewer. Despite these issues, the SPOT/Dx pilot study attempted to generalize and extrapolate the findings from its small pilot program, which focused on a small number of rare variants in 1 disease, to the overall performance of NGS LDTs for all cancer types.3 It is worth noting that MNVs in KRAS and NRAS are exceedingly rare in real-world colon cancer cases. This raises the question of whether the assessment of an NGS LDT’s performance accurately reflects its utility in clinical practice if the reference materials used for testing are not representative of real-world cases.
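The annotation problem described above, in which each substitution within an MNV is interpreted individually, can be sketched in a few lines of Python. The `merge_adjacent` function and the positions and alleles below are hypothetical, and the sketch assumes the adjacent substitutions are already known to lie on the same allele; a real pipeline would need read-level phasing evidence before combining calls, and this is not a description of any participating laboratory's pipeline.

```python
from typing import List, Tuple

# A simplified variant call: (position, reference allele, alternate allele).
Call = Tuple[int, str, str]


def merge_adjacent(calls: List[Call]) -> List[Call]:
    """Combine substitutions at directly adjacent positions into one MNV call.

    Assumes all calls are on the same chromosome and the same allele
    (in cis); that assumption is not checked here.
    """
    merged: List[Call] = []
    for pos, ref, alt in sorted(calls):
        if merged:
            prev_pos, prev_ref, prev_alt = merged[-1]
            if prev_pos + len(prev_ref) == pos:
                # This call starts where the previous one ends: extend it.
                merged[-1] = (prev_pos, prev_ref + ref, prev_alt + alt)
                continue
        merged.append((pos, ref, alt))
    return merged


# Hypothetical coordinates and alleles, for illustration only.
adjacent = merge_adjacent([(101, "G", "T"), (100, "G", "A")])
separate = merge_adjacent([(100, "G", "A"), (105, "C", "T")])
```

Here the two substitutions at neighboring positions are combined into a single dinucleotide call, whereas the two distant substitutions remain separate calls. A pipeline that skips this step would report two single-nucleotide changes and could annotate a different protein consequence than the combined MNV.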

Reanalyzing the data from the SPOT/Dx pilot study using conventional CAP PT methods, Zehir et al4 observed a significant improvement in the performance of participating laboratories for detection of both single-nucleotide variants (SNVs) and MNVs. In addition, these authors compared SPOT/Dx results to the results of historical CAP PT programs that included the same KRAS and NRAS variants. Notably, these CAP PT programs, which have been used by hundreds of laboratories for several years under myriad conditions and platforms, are much more likely to reflect the true scope of clinical practice. This comparison demonstrated the excellent performance of NGS LDTs for identifying SNVs and MNVs in KRAS and NRAS (overall detection rate of 97.2% and 91.8% for SNVs and MNVs, respectively). Taken together, the excellent performance of NGS LDTs on CAP PT programs is much more generalizable than the results of the SPOT/Dx pilot study.

The focus of the SPOT/Dx pilot on KRAS and NRAS variants for selecting metastatic colorectal cancer patients for anti-EGFR therapy does not align with the broader trend toward large-scale, pan-tumor genomic analysis adopted by many clinical laboratories.6  Recent advances in precision medicine and genomics reveal the importance of comprehensive genomic profiling, and oncologists seek broader genetic information to inform their treatment decisions beyond a single cancer type. Many clinical trials and research initiatives now prioritize large-scale genomic profiling to identify potential participants and advance our understanding of genetic drivers of cancer.

In recent years, the FDA has been actively engaged in efforts to regulate LDTs, including NGS assays. These regulations aim to establish a framework for the oversight and quality control of LDTs to ensure their safety and effectiveness in clinical practice. The FDA recognizes the importance of harmonizing standards and ensuring that LDTs meet the same level of quality and accuracy as in vitro diagnostics that are subject to FDA clearance or approval. Historically, the FDA exercised enforcement discretion over LDTs, meaning it did not actively regulate them, but rather focused on ensuring the safety and effectiveness of commercial in vitro diagnostics products. As demonstrated by Zehir et al,4  CAP PT programs directly address and satisfy these quality concerns of the FDA by demonstrating the excellent performance of NGS LDTs.

Of late, the FDA has taken steps to increase its oversight of LDTs, presumably because of concerns about the potential risks associated with inaccurate or unreliable test results. However, clinical laboratories perform these test procedures under the direct supervision of highly trained, licensed medical professionals and are fully capable of achieving excellent performance, which is ensured through CAP PT and quality assurance programs. The CAP PT programs, for instance, have consistently demonstrated high levels of accuracy and quality in testing.7  By endorsing and promoting existing CAP PT programs, the FDA can continue to monitor and ensure the quality and accuracy of LDTs. This approach allows clinical laboratories to maintain their commitment to safeguarding patient care while upholding the highest standards of quality and accuracy without the need for additional regulatory burden.

It is important to note that the SPOT/Dx pilot, although well intentioned, can be criticized for inadvertently supporting the FDA’s position on increased regulation by underestimating NGS LDT performance. This raises concerns that alternative PT programs that do not accurately reflect the true capabilities of clinical laboratories can be used by the FDA as evidence to justify its regulation of LDTs. As such, it is critical that laboratory professionals carefully assess the intention, value, and utility of alternative PT programs before they are implemented in the clinical laboratory.

The necessity for alternative PT is a direct response to the dynamic and ever-evolving landscape of laboratory testing. Laboratories frequently encounter situations where they must analyze new or emerging substances, employ customized testing methods, or cater to specific patient groups not covered by existing PT programs. Moreover, the introduction of innovative technologies and testing approaches may not always align with the availability of traditional PT programs. For instance, the SPOT/Dx pilot study illustrates an alternative PT option for NGS LDTs, even when an equivalent CAP PT program is available. Analyzing both PT programs offers a valuable opportunity to explore and assess whether existing CAP PT programs can be enhanced in this domain. In such scenarios, laboratories must make the decision to devise alternative PT strategies to ensure the precision and reliability of their test results. For example, with the increasing prevalence of advanced forms of NGS testing like cell-free DNA analysis, there is a growing need to establish complementary or alternative PT assessment programs when an existing CAP PT is either not available or unsuitable for some reason. These alternative PT strategies can encompass a variety of approaches, including commercially sponsored studies,8  collaborative interlaboratory studies,9  and in-house validation.

Nevertheless, careful evaluation of the design of the SPOT/Dx pilot study is crucial for accurately interpreting its findings. Reanalyzing the SPOT/Dx data, Zehir et al4 concluded that the SPOT/Dx pilot did not accurately assess performance; in fact, the participating laboratories achieved excellent results. It is essential to stress that the performance of clinical laboratories should not be judged solely on reference materials consisting of extreme conditions, such as an overrepresentation of MNVs, a small number of rare variants, or strict evaluation of the detection of VAFs near the NGS LDT’s LOD, because such results could be used inappropriately to advance federal regulation of LDTs. Although the SPOT/Dx pilot study shed light on the complexities of NGS LDTs, its reanalysis ultimately reaffirms the excellence of participating laboratories, emphasizing the importance of a holistic approach to evaluating clinical laboratory performance beyond extreme reference material conditions.

References

1. Wilkinson DA, Wagner EA. Quality management in laboratory medicine. In: Wagar EA, Cohen MB, Karcher DS, Siegal GP, eds. Laboratory Administration for Pathologists. 2nd ed. Northfield, IL: College of American Pathologists; 2019:119–136.
2. Zehnbauer B, Lofton-Day C, Pfeifer J, Shaughnessy E, Goh L. Diagnostic quality assurance pilot: a model to demonstrate comparative laboratory test performance with an oncology companion diagnostic assay. J Mol Diagn. 2017;19(1):1–3.
3. Pfeifer JD, Loberg R, Lofton-Day C, Zehnbauer BA. Reference samples to compare next-generation sequencing test performance for oncology therapeutics and diagnostics. Am J Clin Pathol. 2022;157(4):628–638.
4. Zehir A, Nardi V, Konnick EQ, et al. SPOT/Dx pilot reanalysis and College of American Pathologists proficiency testing for KRAS and NRAS demonstrate excellent laboratory performance [published online ahead of print September 30, 2023]. Arch Pathol Lab Med.
5. Merker JD, Devereaux K, Iafrate AJ, et al. Proficiency testing of standardized samples shows very high interlaboratory agreement for clinical next-generation sequencing-based oncology assays. Arch Pathol Lab Med. 2019;143(4):463–471.
6. Harada S, Mackinnon AC. New approaches and strategies for proficiency testing for next-generation sequencing-based oncology assays. Am J Clin Pathol. 2022;157(4):478–479.
7. Furtado LV, Souers RJ, Vasalos P, et al. Four-year laboratory performance of the first College of American Pathologists in silico next-generation sequencing bioinformatics proficiency testing surveys. Arch Pathol Lab Med. 2023;147(2):137–142.
8. Cusick MF, Clark L, Tu T, et al. Performance characteristics of chimerism testing by next generation sequencing. Hum Immunol. 2022;83(1):61–69.
9. Spence T, Stickle N, Yu C, et al. Inter-laboratory proficiency testing scheme for tumour next-generation sequencing in Ontario: a pilot study. Curr Oncol. 2019;26(6):e717–e732.

Author notes

Harada and Mackinnon contributed equally to this manuscript.

The authors have no relevant financial interest in the products or companies described in this article.