The College of American Pathologists periodically surveys laboratories to determine changes in cytopathology practices. We report the results of a 2016 survey concerning thyroid fine-needle aspiration (FNA).
To provide a cross-sectional survey of thyroid cytology practices in 2016.
In 2016, a survey was sent to 2013 laboratories participating in the College of American Pathologists Non-Gynecologic Cytology Education Program (NGC-A) requesting data from 2015–2016 on several topics relating to thyroid FNA.
A total of 878 laboratories (43.6% of 2013) replied to the survey. Radiologists performed the most thyroid FNA procedures in most laboratories (70%; 529 of 756), followed by endocrinologists (18.7%; 141 of 756), and most of these were performed under ultrasound guidance (92.1%; 699 of 759). A total of 32.6% of respondents (251 of 769) provided feedback on unsatisfactory rates for nonpathology providers who performed FNA. Intraprocedural adequacy assessment was primarily performed by attending pathologists (77.4%; 490 of 633) or cytotechnologists (28.4%; 180 of 633). Most laboratories used the Bethesda System for Reporting Thyroid Cytopathology (89.8%; 701 of 781) and performed molecular testing based on clinician request (68.1%; 184 of 270) rather than FNA diagnosis. Correlation of thyroid excisions with prior cytology results most often occurred retrospectively (38.4%; 283 of 737) and was used for pathologist interpretive quality assurance purposes.
These survey results offer a snapshot of national thyroid FNA cytology practices in 2016 and indicate that standardized cytology terminology is commonly used; pathologists perform most immediate adequacy assessments for thyroid FNA; laboratories use correlation statistics to evaluate pathologists' performance; and molecular tests are increasingly requested for indeterminate interpretations, but reflex molecular testing is rare.
As in decades past, fine-needle aspiration (FNA) remains the primary means to assess the nature of thyroid nodules. Thyroid FNA has been a resounding success, resulting in reduced thyroidectomy rates, reduced cost of care, and an appropriately increased proportion of thyroidectomies performed for malignant disease.1–3 Nevertheless, more than 50% of currently resected thyroid nodules are benign, an indication of the persistent limitations of thyroid FNA.
In the past, use of diagnostic terminology had been inconsistent and confusing. The quality of samples is highly variable, and interobserver reproducibility has been low.4–6 Even under the best of circumstances, there is a group of patients in whom the FNA findings are indeterminate, resulting in anxiety and subsequent surgery that may have been avoidable. Proposed methods to improve laboratory practices include use of uniform diagnostic criteria, terminology, specimen handling, and quality assurance policies. The Bethesda System for Reporting Thyroid Cytopathology (TBSRTC), first introduced in 2009, provides a set of uniform diagnostic criteria, nomenclature, and clinical management recommendations for FNA cytology reports.7 To assess the current state of laboratory practices, we analyzed responses to a supplemental questionnaire provided in the first mailing of the College of American Pathologists (CAP) Non-Gynecologic Cytopathology Education Program in 2016. The survey posed a series of questions regarding the use of standard terminology, specimen acquisition, ancillary testing, and quality assurance in the practice of thyroid cytopathology. The responses were compared with the results of our prior, similar survey from 2011.8
MATERIALS AND METHODS
Members of the CAP Cytopathology and Quality Practices Committees collaborated on the formulation of a supplemental questionnaire examining 2015–2016 thyroid FNA cytology practices to accompany the spring mailing of the 2016 Non-Gynecologic Cytopathology Education Program. The Cytopathology Committee created the initial survey and forwarded it to the Quality Practices Committee for input and modification. All survey questions were reviewed by a biostatistician for question clarity and validity. The first part of the survey consisted of standard demographic questions released with all CAP surveys, including the type of institution and the general location of the laboratory. Practice questions were selected to cover a broad range of parameters, including the application of Bethesda terminology for reporting, specimen acquisition, the use of ancillary testing, and quality assurance activities. Reporting of diagnostic categories from 2015 was optional, but we asked laboratories collecting those data to report actual values. We chose 2015 in anticipation that laboratories would have the data readily available. Tests for association between testing practices and thyroid FNA test volume were performed using Kruskal-Wallis and Wilcoxon rank sum tests, and tests for association between testing practices and institution type were evaluated with χ2 tests. Differences between the 2011 and 2016 survey results were analyzed with t-tests for testing volumes and χ2 tests for practice characteristics. A P value <.05 was considered statistically significant. All results were summarized and analyzed using SAS 9.3 (SAS Institute, Cary, NC).
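The χ2 comparisons described above can be illustrated with a short Python sketch. This is not the authors' SAS code; it is a minimal, assumption-laden example of a Pearson χ2 test on a 2 × 2 table without continuity correction (whether the original analysis applied a correction is not stated). The function name `chi_square_2x2` is hypothetical, and the counts are those reported later in the article for laboratories using TBSRTC strictly as described (353 of 451 in 2011; 269 of 686 in 2016).

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic and P value for the table [[a, b], [c, d]].

    No continuity correction is applied (an assumption of this sketch).
    With 1 degree of freedom, the upper-tail probability is erfc(sqrt(x / 2)).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Counts taken from the text: strict use of TBSRTC among laboratories using it.
strict_2011, users_2011 = 353, 451
strict_2016, users_2016 = 269, 686

chi2, p_value = chi_square_2x2(
    strict_2011, users_2011 - strict_2011,
    strict_2016, users_2016 - strict_2016,
)
print(f"chi-square = {chi2:.1f}, P < .001: {p_value < 0.001}")
```

A hand-rolled formula is used here only to keep the example dependency-free; in practice a statistics package would normally be preferred.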
Of 2013 laboratories that received the survey, 878 (43.6%) responded to at least the first question, which pertained to whether the laboratory processed thyroid FNA specimens. Only survey results from the 783 laboratories that interpret thyroid FNA specimens were included in the summary and analysis of the subsequent thyroid practice questions. Questions were posed in multiple-choice format; any question could be skipped, and multiple responses were allowed for many of the questions. Laboratories that did not interpret thyroid FNA specimens (10.8%; 95 of 878) were not expected to complete the survey. Of 783 respondents, 711 (90.8%) were located in the United States, 17 (2.2%) were in Canada, and 55 (7.0%) were international. Practice settings varied; of 669 respondents, 305 (45.6%) worked in nonprofit hospitals, 106 (15.8%) in proprietary hospitals, 69 (10.3%) in independent laboratories, 59 (8.8%) in city, county, or state hospitals, 49 (7.3%) in university hospitals, 41 (6.1%) in national or corporate laboratories, and 29 (4.3%) in Department of Defense/Veterans Administration hospitals (Table 1).
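The response figures above can be rechecked with simple arithmetic. The following Python sketch is illustrative only; the helper name `pct` is ours and is not part of the survey analysis. It recomputes the reported percentages from the counts given in the text.

```python
def pct(count, total):
    """Return count/total as a percentage rounded to one decimal place."""
    return round(100 * count / total, 1)

surveyed = 2013        # laboratories that received the survey
responded = 878        # answered at least the first question
no_thyroid_fna = 95    # do not interpret thyroid FNA specimens
interpreting = responded - no_thyroid_fna  # laboratories included in the analysis

# Location of the laboratories that interpret thyroid FNA specimens.
regions = {"United States": 711, "Canada": 17, "International": 55}

print("Response rate (%):", pct(responded, surveyed))
print("Not interpreting thyroid FNA (%):", pct(no_thyroid_fna, responded))
for name, count in regions.items():
    print(f"{name} (%):", pct(count, interpreting))
```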
For 687 respondents, the mean number of thyroid FNA specimens examined in 2015 was 305. Tables 2 and 3 show optional data from questions related to diagnostic reporting categories. Of 783 laboratories, 235 (30.0%) reported actual data for 2015 and reported interpretations as diagnostic categories using Bethesda or an equivalent system. Table 2 shows that cases were most often reported as benign (mean, 71.4%), followed by atypia of undetermined significance/follicular lesion of undetermined significance (AUS/FLUS)/atypical (mean, 8.6%) and nondiagnostic/unsatisfactory (mean, 8.6%). Follicular neoplasm/suspicious for follicular neoplasm (FN/SFN) was reported at a mean of 4.6%, followed by malignant (mean, 3.7%) and suspicious for malignancy (mean, 3.1%). Table 3 indicates that most respondents (89.8%; 701 of 781) use the Bethesda terminology for reporting thyroid FNA, and of these, 60.8% (417 of 686) make modifications, whereas 39.2% (269 of 686) use it as described in the textbook. Regarding the terminology for atypical cases, 67.2% (506 of 753) use FLUS and 65.6% (494 of 753) use AUS, whereas 26.2% (197 of 753) use both or other terms (multiple responses were allowed; data not shown). The most common reasons for repeating a thyroid FNA were that the specimen was nondiagnostic or unsatisfactory (76.2%; 541 of 710) or that it was atypical (67.3%; 478 of 710). Thyroid FNAs most often followed by surgery were those interpreted as malignant (93.7%; 659 of 703), suspicious for malignancy (SFM; 91.7%; 645 of 703), and FN/SFN (80.5%; 566 of 703).
Table 4 summarizes the common cytopreparation types for thyroid FNA. Of 780 respondents to a question regarding types of preparations examined (multiple responses allowed), 735 (94.2%) used direct smears, whereas 452 (57.9%) used some form of liquid-based preparation other than cytospins, and 575 (73.7%) used cell blocks. Slightly more than a quarter (27.8%; 217) used core biopsies. We asked which group typically performs most of the thyroid FNAs that the laboratory interprets. In most settings (n = 756), procedures were typically performed by nonpathologists: 529 (70%) by radiologists, 141 (18.7%) by endocrinologists, 38 (5%) by otolaryngologists, and 22 (2.9%) by pathologists. Laboratories reported that samples were most commonly obtained under ultrasound guidance (92.1%; 699 of 759) rather than by palpation (6.2%; 47 of 759); 1.7% (13 of 759) indicated that another method was used to obtain the specimen.
Table 5 summarizes questions related to on-site evaluation and shows the personnel who typically perform intraprocedural adequacy assessment. The respondents were asked to give the percent of individuals performing adequacy assessment for the practice, to total 100%. In other words, if cytotechnologists performed adequacy assessment half of the time, and attending pathologists the other half, the response would be 50% for cytotechnologists and 50% for pathologists. From Table 5, the most common individual performing assessment was the attending pathologist, who performed 100% of the assessments in nearly 60% of laboratories (59.9%; 379 of 633). When sharing the responsibility with others in the laboratory, the other individual was usually a cytotechnologist, and 13.1% (83 of 633) of practices relied solely on cytotechnologists for adequacy assessment. More than 90% of the 633 laboratories reported several staff who never perform adequacy assessments. These staff were pathologists in training, including fellows (96.1%; 608 of 633), radiologists (93.7%; 593 of 633), surgeons (98.4%; 623 of 633), and endocrinologists (96.5%; 611 of 633). When asked about the amount of time required to perform an adequacy assessment for 1 specimen site or 1 part of a case, more than 70% (468 of 660) of the respondents indicated that the assessment required up to 30 minutes, and 29.1% (192 of 660) more than 30 minutes.
The survey did not assess the percentage of cases having on-site adequacy evaluation, but we did query whether feedback on unsatisfactory rates is provided to nonpathology providers. About two-thirds of respondents (67.4%; 518 of 769) did not provide feedback to nonpathologist providers regarding unsatisfactory rates. Of those who did (32.6%; 251 of 769), about half (50.2%; 121 of 241) indicated that they did this either during the procedure (29.8%; 36 of 121) or as needed (37.2%; 45 of 121) rather than at a specific interval. The other common responses were annually (17.8%; 43 of 241), monthly (17.0%; 41 of 241), or quarterly (10.8%; 26 of 241). To the question, “are thyroid FNA slides typically screened and marked by a cytotechnologist?” 62.2% (465 of 748) responded in the affirmative.
A series of questions on quality improvement activities for thyroid cytology revealed that 44.9% (342 of 762) of laboratories did not have a policy of prospective or retrospective review of some or all thyroid FNA cases. Those that did were roughly split among laboratories with a policy of prospective review (29.3%; 123 of 420), retrospective review (36.4%; 153 of 420), or both (34.3%; 144 of 420) in some or all cases. In 37.5% (156 of 416) of laboratories with a review policy, all malignant and atypical cases were reviewed prospectively, whereas other respondents reviewed cases at the discretion of the pathologist (35.1%; 146 of 416), retrospectively (18.3%; 76 of 416), or according to a set percentage of cases (14.7%; 61 of 416). A total of 57 laboratories indicated that only certain interpretations require review (13.7%; 57 of 416) and the remainder (13.0%; 54 of 416) used “other” review policies. Multiple responses were allowed for this question.
Cytologic-histologic correlation was performed in a variety of ways. The largest group (38.4%; 283 of 737) of laboratories conducted retrospective correlation after the surgical case was signed out, whereas in 26.3% (194 of 737) of laboratories, correlation was conducted at the time of surgical specimen sign-out. An additional 20.6% (152 of 737) conducted some combination of both prospective and retrospective cytologic-histologic correlation. The remaining laboratories performed correlations on specific cases only (Table 6). Most respondents (67.1%; 494 of 736) indicated that up to 25% of thyroid specimens received surgical specimen correlation in their practices, whereas 9.9% (73 of 736) of laboratories had no in-house follow-up. Among 749 responding laboratories, surgical follow-up was tracked most often for pathologist interpretive quality assurance (45.0%; 337 of 749) and, less commonly, for cytotechnologist evaluation (8.1%; 61 of 749). Two “yes/no” questions investigated transmission of information about thyroid surgical or FNA results apart from issuing the report. Thyroid cases were presented at multidisciplinary general tumor boards in 54.0% (383 of 709) of laboratories, but in only 15.6% (99 of 634) of laboratories were thyroid cases discussed at a dedicated multidisciplinary thyroid/endocrine tumor board or conference.
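The size of the group performing correlation on specific cases only is not stated in the paragraph above, but it follows from the three reported counts. A minimal Python sketch of that subtraction is shown here; the dictionary labels are our own shorthand, and the calculation assumes the four groups are mutually exclusive and account for all 737 respondents.

```python
total_respondents = 737

# Counts reported in the text for each correlation approach.
reported = {
    "retrospective, after surgical sign-out": 283,
    "at the time of surgical sign-out": 194,
    "both prospective and retrospective": 152,
}

# Laboratories not in any reported group performed focused correlation only.
specific_cases_only = total_respondents - sum(reported.values())

for label, count in reported.items():
    print(f"{label}: {count} ({100 * count / total_respondents:.1f}%)")
print(
    f"specific cases only: {specific_cases_only} "
    f"({100 * specific_cases_only / total_respondents:.1f}%)"
)
```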
Ancillary testing was the subject of several questions. The type of specimen most often used for immunohistochemistry and other special stains was a formalin-fixed cell block (92.7%; 480 of 518); less common were liquid-based/cytospin preparations (18.4%; 95 of 518) or direct smears (13.7%; 71 of 518). Cell blocks without formalin fixation (eg, Cellient, Hologic Inc, Marlborough, Massachusetts) were the least commonly used specimen (5.4%; 28 of 518). Of 458 respondents, 26.6% (122) indicated that immunohistochemistry was separately validated for cytology specimens. Molecular testing was offered on thyroid FNA specimens in 43.7% (338 of 774) of laboratories and primarily performed at a reference or referral laboratory (93.4%; 255 of 273) rather than in-house (10.6%; 29 of 273) or in the cytology laboratory (1.5%; 4 of 273). Molecular testing was directly associated with the volume of thyroid specimens. Laboratories that offered molecular testing had a median annual volume of 197 (n = 292) specimens compared with a median of 116 (n = 389) for laboratories that did not offer the testing (P < .001; Wilcoxon rank sum test). We found no other significant associations between molecular testing and laboratory type or volume in the analysis performed. In most settings (68.1%; 184 of 270), molecular testing was performed upon clinician request; other responses (multiple allowed) indicated that laboratories considered particular interpretations to be indications for molecular testing, including all FLUS/AUS (31.1%; 84 of 270), all FN/SFN (17.8%; 48 of 270), and all carcinomas (3.0%; 8 of 270). Pathologists directed the decision to perform molecular testing for 28.1% (76 of 270).
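The Wilcoxon rank sum comparison of specimen volumes mentioned above can be sketched in Python as follows. The per-laboratory volumes were not published, so the two lists below are entirely hypothetical and chosen only for illustration; the function name `rank_sum_test` is ours. The sketch uses the large-sample normal approximation with average ranks for ties and applies neither a tie correction nor a continuity correction, so it will not necessarily reproduce the SAS output.

```python
import math

def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank sum test (normal approximation).

    Returns the rank sum of x, the z statistic, and the P value.
    Ties receive average ranks; no tie or continuity correction is applied.
    """
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    w = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    n1, n2 = len(x), len(y)
    mean_w = n1 * (n1 + n2 + 1) / 2
    sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean_w) / sd_w
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return w, z, p_value

# Hypothetical annual thyroid FNA volumes (NOT survey data).
offers_molecular = [120, 150, 197, 210, 260, 310, 400, 520]
no_molecular = [40, 60, 75, 90, 95, 116, 130, 160]

w, z, p_value = rank_sum_test(offers_molecular, no_molecular)
print(f"W = {w}, z = {z:.2f}, P = {p_value:.4f}")
```

With real data, the same call would take the reported volumes for the 292 laboratories offering molecular testing and the 389 that did not.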
The survey did not address the specific assays performed by the participating laboratories, but the most commonly reported molecular tests (either referred or performed in-house) were BRAF (92.2%; 224 of 243), proprietary test panels (79.0%; 196 of 248), RAS (81.9%; 185 of 226), and RET-PTC (76.4%; 168 of 220). More than half (66.9%; 182 of 272) of laboratories indicated that molecular results were incorporated into the pathology report, with 74.7% (130 of 174) including correlative information as to the significance of the reported results.
A CAP Non-Gynecologic Education Program supplemental questionnaire issued in 2011, designed primarily to investigate implementation of TBSRTC, posed questions similar to those of our survey and is used for comparison with the current survey (Table 7).8 We recognize that these comparisons are not exact because the same laboratories were not necessarily surveyed, but because the surveys were both deployed to laboratories participating in cytopathology education programs, there is likely some overlap of respondents. The 2 respondent groups also had similar thyroid volume distributions (P = .81).
Thyroid Cytology Terminology
In 2016, respondents to this survey (n = 687) examined a mean of 305 thyroid FNA specimens annually, and in 2011 the mean number of thyroid FNAs from 777 respondents was 326,8 suggesting that there has been no significant change in the number of thyroid FNAs submitted to participant laboratories (P = .81). Notably, the use of TBSRTC has markedly increased from 2011 (60.9%) to 2016 (89.8%; Table 7), but fewer laboratories use it as strictly described in the text (78.3% in 2011 versus 39.2% in 2016). Our study indicates that there has been a significant increase (P < .001) in the percent of cases interpreted as benign from 2011 (mean, 62.6%) to 2016 (mean, 71.4%), along with comparable reporting of nondiagnostic/unsatisfactory specimens (2011 mean, 9.8%; 2016 mean, 8.6%; P = .06) and a significant decrease in reporting for SFN/FN (2011 mean, 7.6%; 2016 mean, 4.6%; P < .001). The mean for cases reported as SFM has decreased from 2011 (4.0%) to 2016 (3.1%; P = .007), as has the mean for cases reported as malignant (2011, 4.4%; 2016, 3.7%; P = .03). The cases interpreted as AUS/FLUS slightly increased, although not significantly, from a mean of 7.7% in 2011 to 8.6% in 2016 (P = .10). From these data, it is possible that at least some of the cases in indeterminate categories have been assigned to the benign category. Examples of reassignment that might affect these values include reporting specimens with Hürthle cells in cases of chronic lymphocytic thyroiditis and reactive processes, which can show some nuclear features of papillary carcinoma, as benign rather than SFM, and reporting cases with some microfollicles (as opposed to those with a prominent microfollicular pattern) as benign rather than SFN/FN.
Jing and Michael9 found that Hashimoto thyroiditis can be misinterpreted as suspicious for papillary thyroid carcinoma because of the presence of nuclear inclusions and grooves in oncocytes, a pitfall also noted by Baloch and LiVolsi.10 If these changes are due to shifts in the use of cytology terminology, then these findings are consistent with a significantly greater number of laboratories in 2016 having implemented TBSRTC than those surveyed in 2011 (P < .001), and many 2011 surveyed laboratories had indicated that they planned to adopt that terminology in the near future. However, in 2016, fewer laboratories using the system reported using the terminology as strictly described (39.2%; 269 of 686) compared with 2011 (78.3%; 353 of 451; P < .001). This is reflected in the use of AUS/FLUS. Although either term is acceptable for atypical thyroid specimens, the original intention was that laboratories choose only one term for routine use. Our study indicates that 35.5% (267 of 753) use both (Table 3) or other combinations for atypical findings. We asked, “which of the following terms for atypical thyroid FNA does your laboratory use?” and allowed laboratories to select all that apply. It may be that laboratories give pathologists the freedom to use the terminology of their choice, rather than selecting a laboratory-wide standard for atypical thyroid specimens. It is also possible that the use of AUS/FLUS as equivalent terms was initially misunderstood and pathologists used both to indicate differences in nuclear (AUS) versus architectural (FLUS) atypia. This potential problem has been clarified in the second edition of the TBSRTC by allowing for the option of subclassifying AUS into nuclear or architectural atypia in the description.7 The overall shift in diagnostic terms might also be related to the increasing use of ultrasound to select nodules for aspiration.
Selective aspiration of nodules with suspicious ultrasound findings could increase the rate of malignant diagnoses, especially if diagnostic tissue is obtained, whereas detection of smaller benign nodules amenable to aspiration might result in an increase in benign interpretations.
Our survey appears to reflect improved adherence to 2006 management guidelines11,12 for repeating the FNA in AUS/FLUS/atypical cases as supported in TBSRTC text. Most laboratories (67.3%; 478 of 710) did so, as opposed to 22% (155 of 703) that performed surgery. We did not specifically ask whether surgery followed the first or a subsequent AUS/FLUS interpretation. The trend for repeating AUS/FLUS is difficult to assess. In the 2011 survey, 32.1% (132 of 411; Table 7) were repeated after the first AUS/FLUS interpretation, but 41.8% (172 of 411) noted that management depended on the clinician, 19.2% (79 of 411) were unsure of the usual follow-up, and 6.8% (28 of 411) indicated that it was followed with surgery.8 In our study, repeat FNA also typically occurred if the interpretation was nondiagnostic/unsatisfactory (76.2%; 541 of 710). In 2016, respondents indicated that FN/SFN typically led to surgery (80.5%; 566 of 703), as did SFM (91.7%; 645 of 703) or malignant (93.7%; 659 of 703). The knowledge that FN/SFN and SFM lead to surgery may have also influenced pathologists' reporting of these cases in a climate stressing thyroid preservation and may account for the decline in these interpretations. Although not a part of the current survey, the 2011 survey queried whether laboratories incorporated management recommendations into their diagnostic reports, and most (around 70%) included recommendations for AUS/FLUS, and more than half did so for SFN, FN, and SFM. In 2011, statements regarding risk of malignancy were reported by more than half of the respondents when the diagnosis was SFN, FN, AUS/FLUS, and SFM.
Both the 2006 and 2015 American Thyroid Association guidelines for the management of thyroid nodules recommend that a statement on overall risk for malignancy be included in the cytology report.11,12 This information may be important to subsequent patient management by recommending correlation with ultrasound findings and family history to influence decision-making. The clinician must have actionable information, and TBSRTC provides uniform language for expressing it. The adoption of the Bethesda system by most responding laboratories has had a positive impact, with the system demonstrating high sensitivity and negative predictive value, while driving an increasing proportion of surgeries for truly malignant disease.13
It is difficult to make a direct comparison between the 2 surveys for specimen acquisition because the questions were worded slightly differently. In 2016, laboratories reported that endocrinologists performed most of the thyroid FNAs in only 18.7% (141 of 756) of laboratories, and even fewer laboratories (5.0%; 38 of 756) reported that most procedures were done by otolaryngologists (surgeons). Pathologists performed most thyroid FNAs in only 2.9% (22 of 756) of laboratories in 2016. This represents a significant change from 2011 data, where surgeons (56.4%; 400 of 709) performed most of the FNAs, primarily by palpation (63.3%; 353 of 556), and the shift to radiologist performance (70%; 529 of 756) may reflect a trend toward ultrasound-guided procurement. In 2011, radiologists were cited as the provider performing primarily ultrasound-guided specimens (76.3%; 521 of 683).8 Not surprisingly, most intraprocedural assessment for specimen adequacy in 2016 was performed by attending pathologists, followed by cytotechnologists (Table 5). Cytotechnologists performed all on-site adequacy assessment in only 83 laboratories (13.1%; 83 of 633). Even pathologists in training (which should include cytopathology fellows) were not frequently involved in on-site assessment (96.1% [608 of 633] reported that trainees never performed evaluation), but these data may reflect the small number of training programs represented in the respondent sample, or that pathologists always directly supervise trainees. These trends may also reflect billing restrictions, because only physicians can seek reimbursement for adequacy assessment. Of nonpathology professionals, radiologists are reported as performing on-site assessment most frequently (6.3%; 40 of 633), and this may mirror the high number of radiologists performing the procedure. The amount of time spent performing on-site adequacy evaluation per site/case part was usually more than 21 minutes.
This highlights the major time commitment that adequacy determinations pose for pathologists, which may account for the practice of sending cytotechnologists to procedures in some settings. Regardless of the effort, it is one of the most important quality assurance steps taken in the process and can significantly improve diagnostic rates while reducing cost of care.14 Adequacy is also highly dependent on aspirator technical skills, which can be improved by collaboration with pathologists.15
Direct smears remain the most popular specimen preparation for thyroid FNA, slightly increasing in use since 2011 (Table 7), despite continued use of liquid-based preparations. In our study, direct smears are prepared by 94.2% (735 of 780) of laboratories; we did not specifically ask laboratories to break out alcohol-fixed and air-dried preparations. In the 2011 survey, 87% (642 of 738) prepared alcohol-fixed direct smears and 73.8% (545 of 738) prepared air-dried direct smears. Multiple responses were allowed for both surveys. In 2011, only 17.4% (125 of 717) used liquid-based cytology (LBC) as the primary method of FNA preparation, even though slightly more than 60% (451 of 738) included LBC preparations alone or in conjunction with other preparation types. In 2016, 57.9% (452 of 780) used some form of LBC other than cytospins with thyroid FNA; we did not specifically ask about the primary thyroid specimen preparation type. Liquid-based cytology is increasingly applied to FNA cytology, but direct smears appear to be favored by most pathologists interpreting thyroid FNA. This may be due to reliance on air-dried specimens for evaluation of enlarged nuclei and other cytoplasmic or background features, as well as improved ability to recognize watery colloid on smears as opposed to LBC specimens. The use of LBC may reflect an adjunctive role for LBC in laboratories, where needle rinses or additional passes are processed and preserved following smear preparation in anticipation of using it for ancillary studies. ThinPrep (Hologic) is used more commonly than cytospins for processing needle rinses, and it is likely that some laboratories that do not provide on-site evaluation may encourage the use of LBC media for specimen submission. 
Combining LBC with conventional smears has been shown to improve diagnostic accuracy by decreasing suspicious for follicular neoplasm and follicular neoplasm diagnoses while increasing benign interpretations.16 The laboratory cannot bill for both preparation types, restricting adjunct LBC use in many laboratories. The use of core biopsies to evaluate thyroid (27.8%; 217 of 780) has decreased since the 2011 survey (37%; 279 of 755; P < .001), which may also reflect guidance published in the interim that indicates core biopsies only for those solid nodules in which there is a high suspicion for a malignant process and FNA procurement is repeatedly unsatisfactory/nondiagnostic.11,12 Thin core needle biopsy crush preparations may be useful in conjunction with FNA for successful adequacy and evaluation of previously unsatisfactory FNA cases, especially in very vascular or fibrotic nodules.17 Ultrasound-guided FNA is preferred over core needle biopsies because it is safe, reliable, and cost-effective, and can more easily sample multiple areas in a nodule.
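The decline in core biopsy use reported above (279 of 755 in 2011 versus 217 of 780 in 2016) can be checked with a pooled two-proportion z test, which for a 2 × 2 comparison is equivalent to an uncorrected χ2 test. The Python sketch below is illustrative only and is not the authors' SAS analysis; the function name `two_proportion_z` is hypothetical.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-proportion z test; returns z and the two-sided P value."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / standard_error
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Laboratories using core biopsies: 2016 survey versus 2011 survey.
z, p_value = two_proportion_z(217, 780, 279, 755)
print(f"z = {z:.2f}, P < .001: {p_value < 0.001}")
```

A negative z here indicates a lower proportion in 2016 than in 2011, in line with the direction described in the text.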
A series of questions on quality assurance activities for thyroid cytology revealed that 44.9% (342 of 762) of laboratories did not have a policy of prospective or retrospective review of some or all thyroid FNA cases. Prospective review of atypical or malignant cases was most commonly noted, and in about one-third of laboratories the pathologist is allowed discretion in selecting cases. Only a minority (15.6%; 99 of 634) of laboratories discussed thyroid cases at a dedicated thyroid or endocrine conference, although many laboratories do include thyroid specimens in a general surgical-pathology correlation conference or tumor board. These findings may reflect the volume of thyroid surgical cases, the ability or inability to provide correlation with the cytology specimen, or the presence and interest of subspecialists, such as endocrinologists and otolaryngologists, in these conferences. Our survey did not address questions of formal intrainstitutional consultation. Second reviews may have a significant impact on thyroid FNA interpretations, particularly when the review is focused on specific diagnostic categories. Kuijpers et al18 found major (12.8%; 28 of 218) and minor (27.1%; 59 of 218) discrepancies in selected borderline cases subjected to second review by expert cytopathologists during a 2-year period, with histologic concordance in the final consensus diagnosis in 95.5% (208 of 218) of cases. Davidov et al19 found high discordance rates in thyroid FNA cases diagnosed as indeterminate (63%; 81 of 129) compared with overall diagnoses (34%; 113 of 331). Those findings likely reflect the difficulty of indeterminate cases and pathologist variability in applying criteria and rendering definitive interpretations. 
Review of cases in group or multidisciplinary conference is an educational approach to quality assurance,20 and it may mitigate the effect of interobserver variability and reduce indeterminate diagnoses21 through refinement of diagnostic criteria among observers.
Most of the surveyed laboratories perform cytologic-histologic correlation, with the largest group (38.4%; 283 of 737) “routinely” performing it retrospectively after the surgical sign-out. There is room for improvement in this activity because approximately 15% (108 of 737) performed only focused correlations. The usefulness of cytologic-histologic correlation has been demonstrated in nongynecologic specimens generally22 as well as in thyroid FNA cytology in particular.23 One caveat to performing thyroid cytologic-histologic correlation is that there is a high degree of interpretive discordance of follicular-patterned lesions both in cytology and histology.24–26 Unfortunately, there is no standardized method of performing cytologic-histologic correlation, nor of measuring the outcomes, so interinstitutional comparisons are difficult to make. Of interest, investigating the surgical outcome for a thyroid FNA diagnosis is used primarily to monitor a pathologist's interpretative accuracy (45%; 337 of 749) as opposed to cytotechnologist accuracy (8.1%; 61 of 749), but is not usually incorporated into pathologists' Ongoing or Focused Professional Practice Evaluation. This could be a lost opportunity for laboratories to improve the interpretive accuracy of individuals.
With the initial implementation of thyroid FNA, the number of thyroidectomies in the United States was drastically reduced, but with the advent of thyroid ultrasound and resulting detection of occult nodules, surgery for indeterminate nodules became the mainstay for diagnosis. Most of these lesions were benign, and patients experienced complications, including hypothyroidism. This recognition spurred health care researchers to seek molecular markers that could be used to triage patients and reduce unnecessary surgery, and surgeons began to adopt lobectomy for the histologic characterization of uncertain FNA diagnoses. In our survey, 43.7% (338 of 774) of laboratories performed molecular testing on thyroid FNAs, and most ancillary testing was performed upon clinician request (68.1%; 184 of 270) as opposed to serving as a triage test for particular cytologic interpretations. Some of our 270 respondents considered “indeterminate” interpretations to be indications for molecular testing, particularly FLUS/AUS/atypia (31.1%; 84 of 270) and FN/SFN (17.8%; 48 of 270) and, least commonly, carcinoma (3.0%; 8 of 270). In most cases (93.4%; 255 of 273), these tests were performed at a reference laboratory. In 2011, only 4.7% (20 of 422) of laboratories reported performing any molecular testing on thyroid FNA specimens; our study suggests increased acceptance of molecular testing. At that time, most of the laboratories that performed molecular testing on thyroid FNAs reported doing so in cases classified as suspicious, malignant, and/or AUS/FLUS, similar to our survey, but the total number of 2011 laboratories reporting was small (n = 20). The lower rate of testing in cases interpreted as malignant in our study (3%; 8 of 270) might reflect correlation with ultrasound findings, cost-controlling measures, or confidence in the FNA interpretation prompting immediate triage to partial or complete thyroidectomy. 
Certain mutations, such as those in RAS and BRAF, may also be associated with specific histologic subtypes or clinical behaviors.27 Although published guidelines do not specifically recommend it, they have tentatively suggested molecular testing as a triage vehicle for indeterminate nodules; this approach has the potential to greatly reduce unnecessary thyroidectomy because most indeterminate nodules are benign.12 The rate of indeterminate diagnoses varies from 10% to 25% of all thyroid FNAs, and the risk of malignancy in cases classified as indeterminate varies from 15% to 30%.13 Most laboratories incorporated molecular results into the cytopathology report, and three-quarters provided correlative information as to their significance. Collation of pertinent results and correlative information into one area or report may prevent duplicate test orders and enhance clinical interpretation, and is another area that laboratories may leverage to improve client services.
Available ancillary tests currently include immunohistochemistry, gene mutation analysis, gene expression profiling, microRNA profiling, and next-generation sequencing. Each has been shown, in studies of variable size and quality, to have the potential to provide increased clarity when applied to indeterminate thyroid FNA samples.28–31 A study by Baca et al32 indicates that qualifying atypia as architectural (eg, microfollicle formation) or cytologic (eg, nuclear features of papillary carcinoma) is reflected in differences in the performance of the Afirma Gene Expression Classifier test (Veracyte Inc, San Francisco, California), with specimens that show only architectural atypia having a lower incidence of malignancy.32 Refinement of cytologic interpretation through molecular correlation may further reduce expense and unnecessary surgeries. Each of these ancillary molecular test options is limited by the need for additional adequate samples (cell block, dedicated aspirate, or leftover aspirate) and by additional cost.
In conclusion, 2016 practice patterns in thyroid FNA cytopathology among participants in the Non-Gynecologic Cytology Education Program showed that the great majority of laboratories have adopted the Bethesda system or some modification of it. Specimen acquisition practices appear fairly stable, with smears performed in most cases, although there has been an increase in radiologist-procured specimens under ultrasound guidance. Although many laboratories perform second review or cytologic-histologic correlation on a portion of cases, a specific and routine quality assurance policy is present in a minority of laboratories, and procedures tend to be highly variable. Performance of on-site evaluation for adequacy is reported as typically requiring 20 minutes or more per target aspirated. An increasing number of laboratories, albeit still the minority, use some form of ancillary testing, most often for indeterminate nodules, and primarily upon clinician request. The practice of thyroid FNA cytology continues to evolve; a study of outcomes at multiple institutions that fully use the Bethesda system, routine prospective quality assurance, and ancillary testing in indeterminate cases would be of great value. In the meantime, these data provide a useful baseline for future assessment of practice patterns.
The physicians are volunteers of the College of American Pathologists committees who receive payment in kind (travel, lodging, and meals) in association with their committee work. Ms Souers, our biostatistician, and Ms Blond are employees of the College of American Pathologists. The authors have no other relevant financial interest in the products or companies described in this article.
All authors are or were members of the College of American Pathologists Quality Practices Committee, Cytopathology Committee, or College of American Pathologists staff. The opinions or assertions contained herein are the private views of the authors and do not reflect the official policy of the Department of the Army, Department of Defense, or US government.