Context.—First-class hospital laboratories are required by the Ministry of Health of the People's Republic of China to participate in a proficiency testing (PT) program. The College of American Pathologists (CAP) PT programs have increasingly become laboratories' preferred choice because of their well-prepared specimens, comprehensive program offerings, scientific evaluation, and valuable educational opportunities.
Objective.—To evaluate the PT performance of our laboratory from 2007 to 2011 using selected analytes and tests.
Design.—The PT results of 22 commonly performed tests in 15 events were evaluated. The rates of unacceptable results for all analytes and tests during the 5-year study period were compared with the χ2 test. Reasons for all unacceptable results were sorted into 8 groups.
Results.—A total of 13 of 22 analytes and tests (59.1%) achieved full scores (100%), whereas 6 analytes and tests failed to reach the passing score of 80% or higher in at least 1 PT event. Judged by the relative distance of the results from the target, expressed as a percentage of the allowed deviation, the performance of some analytes was excellent, including albumin, amylase, lactate dehydrogenase, potassium, protein, total, uric acid, red blood cell, white blood cell, and hemoglobin. Rates of unacceptable results for all analytes demonstrated a declining trend. The top 3 reasons for unacceptable results were identification errors in morphologic tests, specimen problems, and technical problems (25.74%, 23.76%, and 14.85%, respectively).
Conclusions.—The PT performance demonstrated a trend of improvement from 2007 to 2011. Proficiency testing contributed to the improvement of laboratory performance and, ultimately, to better patient care.
Quality improvement in a modern clinical laboratory environment entails the continuous inspection and refinement of processes to ensure the efficient delivery of services that meet the needs and expectations of those who use them. Proficiency testing (PT) provides a measure of the effectiveness of laboratory quality assurance programs.1 Participation in a PT program is also a valuable adjunct to laboratory activities dedicated to the maintenance of reliable analytic methods. The PT program may facilitate continuous quality improvement if laboratory performance is presented in the context of expectations espoused by health care professionals for optimal patient care.2 Two major laboratory-accrediting organizations—the College of American Pathologists (CAP) and the International Organization for Standardization (ISO)—both require laboratories to participate in PT programs. The Ministry of Health of the People's Republic of China also requires first-class hospital laboratories to participate in PT programs. In this study, we evaluated the performance of selected analytes by analyzing the PT results of recent years. The many satisfactory results verified the high quality of analysis in our laboratory, whereas the unsatisfactory PT results suggested potential problems of which we might not be aware in routine work. Furthermore, an understanding of the root causes of those testing errors provided an opportunity for the continuous improvement of laboratory services.
MATERIALS AND METHODS
Background of the Evaluated Hospital and Laboratory
West China Second University Hospital is a university hospital affiliated with the Ministry of Health of the People's Republic of China. It is not only the largest women's and children's teaching hospital in southwest China but also a treatment center for emergency and critical cases. In 2006 the Department of Laboratory Medicine of this hospital became the first laboratory of a specialty hospital in China to be accredited under ISO 15189 (Accreditation Criteria for the Quality and Competence of Medical Laboratories). To date, almost 70 laboratories in China participate in CAP PT programs, among which 5 hospital laboratories and 11 independent laboratories have been accredited by the CAP Laboratory Accreditation Program. Our laboratory has been participating in CAP PT programs since 2007.
Data Sources
The data were extracted from the CAP PT results of the laboratory from 2007 to 2011. The data were PT scores and consisted of values between 0 and 100, representing the percentage of correct responses (usually from 5 challenges) during the testing events for each analyte. For each analyte, 3 event scores per year were included in this article.
Selection of Analytes and Tests
By 2011, CAP PT Surveys covered 245 analytes and tests performed in our laboratory, accounting for 77.8% of the total analytes and tests; these belonged to the areas of chemistry, hematology, immunology, microbiology, transfusion medicine, and cytopathology. Twenty-two analytes and tests were selected for this evaluation because they were representative of different laboratory specialties and were among the most commonly performed tests in most laboratories. Some of the analytes and tests selected for this evaluation were a subset of those evaluated previously.3,4
Materials
Albumin (ALB), amylase (AMY), cholesterol (CHOL), glucose (GLU), lactate dehydrogenase (LDH), potassium (K), protein, total (TP), alanine aminotransferase (ALT), bilirubin, total (TB), triglycerides (TG), and uric acid (UA) were analyzed on the ADVIA 2400 Chemistry System (Siemens Healthcare Diagnostics, Tarrytown, New York) with the manufacturer's reagents and protocols. During the first half of 2008, partial pressure of oxygen (PO2) was measured on a NOVA Stat Profile 5 (NOVA Biomedical, Waltham, Massachusetts); that instrument was then replaced by a Gem Premier 3000 (Instrumentation Laboratory, Bedford, Massachusetts). The ADVIA Centaur XP Immunoassay System (Siemens Healthcare Diagnostics) was used to test hormones, for example, thyroxine (T4). Red blood cell (RBC), white blood cell (WBC), and hemoglobin (HB) were measured on a Sysmex XE-2100 hematology analyzer (Sysmex Corporation, Kobe, Japan). Prothrombin time was measured on a Sysmex CA-7000 analyzer (Sysmex Corporation) with reagents from Siemens Healthcare Diagnostics (Marburg, Germany). Hepatitis B surface antibody (anti-HBs) and hepatitis B core antibody (anti-HBc, total) were tested on a FAME Enzyme Immunoassay Analyzer (Hamilton Medical AG, Bonaduz, Switzerland) with reagents from the Shanghai Kehua biologic technology company (Shanghai, China).
Methods
Proficiency testing is a point sampling of laboratory output that is used to judge the quality of laboratory testing.2 An evaluation of unsatisfactory performance by CAP is an undesirable outcome for the laboratory. For most specialties (blood transfusion and cytology are exceptions), satisfactory performance in a PT event is achieved by attaining an overall testing event score of at least 80%. This study compared the PT results of the 22 analytes and tests for all 15 events during the 5-year period. Every unacceptable result was analyzed, even if the total performance of the PT event was satisfactory.
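The event scoring rule described above can be sketched in a few lines. This is an illustrative sketch only, not CAP's grading code, and the function names are our own:

```python
def event_score(graded_results):
    """Testing event score: the percentage of acceptable responses
    among the graded challenges (usually 5 per analyte)."""
    acceptable = sum(1 for ok in graded_results if ok)
    return acceptable / len(graded_results) * 100.0


def event_satisfactory(score):
    # For most specialties (blood transfusion and cytology excepted),
    # a testing event is satisfactory at a score of at least 80%.
    return score >= 80.0
```

For a typical 5-challenge analyte, a single miss still yields a satisfactory 80% event, whereas 2 misses (60%) do not.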
This article analyzed the detailed results of 17 quantitative analytes from the 22 total analytes and tests, expressed as the relative distance of the laboratory result from the target as a percentage of the allowed deviation (see formula 1). A percentage of 0 indicated no difference between our result and the target value; a percentage of 100% indicated that the difference reached the maximum allowable deviation. A result was acceptable if the percentage fell between −100% and 100%, and the closer the percentage was to 0, the better the result. A negative percentage meant our result was lower than the target value; conversely, a positive percentage meant it was higher. The percentage of 0 was set as the baseline in the figures in this study.
Formula 1: Deviation percentage = (lab result − target value)/allowed deviation × 100%.
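Formula 1 and the acceptability rule above can be expressed as a minimal sketch (the function names and the numbers in the usage note are illustrative, not survey data):

```python
def deviation_percentage(lab_result, target, allowed_deviation):
    """Formula 1: relative distance of a laboratory result from the target,
    expressed as a percentage of the allowed deviation."""
    return (lab_result - target) / allowed_deviation * 100.0


def is_acceptable(lab_result, target, allowed_deviation):
    # A result is acceptable when its deviation percentage
    # lies within [-100%, 100%].
    return abs(deviation_percentage(lab_result, target, allowed_deviation)) <= 100.0
```

For example, a hypothetical result of 5.2 against a target of 5.0 with an allowed deviation of 0.5 gives +40%, well inside the acceptable band; the sign shows the direction of the bias relative to the target.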
To evaluate whether PT performance improved, this article compared the rates of unacceptable results for all analytes and tests during the 5 years with the χ2 test. A P value less than .05 was considered significant.
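For reference, this comparison amounts to a Pearson χ2 test on a 2 × 5 table of unacceptable versus acceptable counts per year; because df = (2 − 1)(5 − 1) = 4 is even, the P value has a simple closed form. The yearly counts below are hypothetical placeholders, not the laboratory's actual tallies:

```python
import math


def chi2_statistic(table):
    """Pearson chi-square statistic for an r x c contingency table
    (rows: unacceptable vs acceptable; columns: years)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    return stat


def chi2_sf_even_df(x, df):
    """P(X > x) for a chi-square distribution with an even number of
    degrees of freedom (closed-form series; df = 4 for a 2 x 5 table)."""
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= (x / 2.0) / i
        total += term
    return math.exp(-x / 2.0) * total


# Hypothetical (unacceptable, acceptable) counts per year, 2007-2011;
# placeholders only -- the real tallies appear in Table 2.
yearly = [(25, 175), (23, 177), (21, 179), (17, 183), (15, 185)]
table = [[u for u, a in yearly], [a for u, a in yearly]]
stat = chi2_statistic(table)
p_value = chi2_sf_even_df(stat, df=4)
```

A P value above .05, as in the study's comparison, would indicate that the observed year-to-year decline is not statistically significant.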
This study summarized the reasons for every unacceptable result and sorted them into 8 broad categories: random errors, identification errors in morphologic tests, specimen problems, instrument problems, methodologic problems, technical problems, wrong peer group used, and clerical errors. We aimed to determine which reasons most often caused unacceptable results, and whether PT performance improved after technologists analyzed every unacceptable result and performed timely corrective actions.
RESULTS
The PT scores of the 22 analytes and tests from 2007 to 2011 are shown in Table 1. Thirteen of the 22 analytes and tests (59.1%) achieved full scores (100%), including ALB, AMY, CHOL, GLU, LDH, K, TP, UA, T4, RBC, WBC, HB, and anti-HBs. For 3 analytes—ALT, TG, and prothrombin time—only 1 PT event score was not 100%. For 6 analytes and tests, including TB, PO2, blood parasite, blood cell identification, yeast identification, and anti-HBc, total, more than 2 PT event scores were not 100%. Unfortunately, 6 analytes and tests (ALT, TB, TG, blood parasite, yeast identification, and anti-HBc, total) each failed to attain the passing score of at least 80% in 1 PT event; these belonged to the areas of chemistry, hematology, microbiology, and immunology. In addition, the scores of many analytes in the first event of 2009 were quite unsatisfactory.
Figures 1 and 2 show the detailed results of the 17 quantitative analytes, expressed as the relative distance of our results from the target as a percentage of the allowed deviation. The stability of each analyte's performance can be seen directly from the degree of deviation of the results in every event. Figure 1, A through L, shows that the performance of AMY, LDH, UA, RBC, HB, TP, ALB, K, WBC, CHOL, GLU, and T4 was acceptable. The percentages of all challenges of some analytes (AMY, LDH, UA, RBC, HB, and TP) were distributed on both sides of 0 and were close to 0 (Figure 1, A through F). The deviation percentages of ALB, K, and WBC were also distributed on both sides of 0, but their distribution across different concentrations showed no certain trend. In contrast, the deviation percentages of CHOL, GLU, and T4 were distributed almost entirely on one side of 0. As seen in Figure 2, A through E, the deviations from the target of different challenges of TB, PO2, ALT, TG, and prothrombin time differed remarkably, and some results even exceeded the allowed deviation.
Figure 1. A through L, Proficiency testing results of selected analytes with better performance for the 5 years, 2007–2011. Abbreviations: ALB, albumin; AMY, amylase; CHOL, cholesterol; GLU, glucose; HB, hemoglobin; K, potassium; LDH, lactate dehydrogenase; RBC, red blood cell; TP, protein, total; T4, thyroxine; UA, uric acid; WBC, white blood cell.
Figure 2. A through E, Proficiency testing results of selected analytes with poorer performance for the 5 years, 2007–2011. Abbreviations: ALT, alanine aminotransferase; PO2, partial pressure of oxygen; TB, total bilirubin; TG, triglycerides.
Over the course of 5 years there were 101 unacceptable results for all analytes in the PT. Other studies5–7 found that declining failure rates were associated with experience in performing PT, such as performing dilutions, reading forms, and mastering data entry. Table 2 shows the rates of unacceptable results for the period 2007 through 2011 in CAP Surveys in the laboratory evaluated. The rate gradually decreased with each year of experience, although the difference among the years was not significant (P = .97).
Rates of Unacceptable Proficiency Testing Results in the College of American Pathologists Surveys, 2007–2011

This study reviewed all of the reasons for unacceptable proficiency testing results and divided them into 8 categories. The top 3 reasons were identification errors in morphologic tests, specimen problems, and technical problems, which accounted for 25.74%, 23.76%, and 14.85%, respectively, of unacceptable PT results (Table 3). Thus, during the 5 years we encountered the most problems with morphologic recognition in the areas of hematology and microbiology.
COMMENT
We moved into our new laboratory location in November 2008, with significant changes in the location of instruments, reagents, and work flow. For example, the Sysmex HST-N 201 automated hematology line replaced the separate Sysmex hematology analyzer, and the Siemens ADVIA work cell automation replaced the Hitachi 7600 automated chemistry analyzer (Hitachi High-Technologies Corporation, Tokyo, Japan). In addition, all of the chemistry reagents changed from open reagents to the manufacturer's reagents. However, Table 1 and Figure 1 show that these changes did not affect the PT results of those analytes and tests. The analysis quality remained quite stable because of the rational new work flow. At the same time, the choice of the new instruments and reagents proved correct: their results were consistent with those of the old instruments, and their working speed was much faster. Clinical laboratories need such instruments to adapt to the vigorous growth in patient specimens.
The PT results of these years allowed us to assess the performance of every analyte and test. Table 1 shows that the result of every challenge of 13 analytes was good. We were confident in the performance of these analytes and believed in the accuracy of the patient results. For 3 analytes, each had 1 PT score that was not 100%; these unacceptable results might have been caused by random errors, occasional instrument malfunction, or other special reasons. For 6 analytes, the results of more than 2 PT events were not completely acceptable. The causes of unacceptability included many potential elements, which compelled us to conduct an in-depth investigation. For example, the results of anti-HBc, total, in 2007 were not perfect; investigation revealed that the specimens should not have been diluted. Since then, the results of all of the viral markers have been excellent. In fact, only 6 analytes and tests had unsatisfactory PT performance in a single event, and these failures were not typically caused by occasional errors. For example, the unsatisfactory results in the first event of 2009 stemmed from specimen problems caused by a customs clearance delay.
Figures 1 and 2 show the 15 event results of the 17 quantitative analytes, expressed as the relative distance of our results from the target as a percentage of the allowed deviation. Figure 1, A through F, illustrates that the performance of those 6 analytes was excellent; the analysis accuracy of analytes such as AMY, RBC, and HB was equally good at different concentrations. As seen in Figure 1, G through I, although the deviation percentages of ALB, K, and WBC were distributed on both sides of 0, the percentages at different concentrations were quite different, indicating that the accuracy of these analytes was inconsistent across concentrations. We should probably reevaluate the linearity and reportable range of these analytes. At the same time, Figure 1, J and K, shows that the deviation percentages of CHOL and GLU were distributed almost entirely on one side of 0, with most results higher than the target value, suggesting that the analysis system might have a positive systematic error. Figure 1, L, illustrates that T4 had a negative systematic error. Therefore, we recalibrated the system even though all of the results were acceptable.
The yearly unacceptable rates suggest that the longer a laboratory participates in PT, the better its performance becomes. Growing experience with PT likely contributed to the decrease in the rates of unacceptable results. This was similar to the reports of others who noted that performance improved as laboratories gradually gained experience in handling PT samples and reporting results.5,6,8 We believe that the decline in unacceptable rates will positively affect the quality of patient testing.
Unsatisfactory PT results could help us improve performance more effectively. The laboratory responded by investigating the source of each error and by modifying the procedure that produced it, with the objective of reducing or eliminating the chance of a recurring process failure. Identification errors in morphologic tests were the main reason for unacceptable CAP PT results in our laboratory. Some blood parasites, such as Babesia sp, are rarely seen in China, so our technologists could not identify them in the early stage of participation in the CAP PT. By learning from PT summary reports and other literature, we found that this kind of parasite is readily identified in America. Since 2009, we have not made any mistakes in identifying them. Another problem of morphologic recognition was distinguishing the species of plasmodia, including Plasmodium ovale, Plasmodium vivax, and Plasmodium falciparum. It is not easy to differentiate the 3 species in a single blood film, but it can be done using the size, shape, and content of infected red blood cells together with the patient history. Usually the PT summary reports listed the key morphologic features of the parasite, which provided us educational opportunities. Therefore, our staff improved their morphologic recognition ability through study of the PT summaries and literature, training, and routine checks.
Specimen problems were the second main reason for unacceptable results. Since 2009 it has always taken us more than 1 month to clear customs for the CAP PT products because the products are considered special medical goods. When the kits arrived at our laboratory, the specimens had already deteriorated and were unfit for testing. That is why the results of some analytes (ALT, TB, TG, etc) were completely unacceptable in 2009 (Figure 2, A through C). Since then we have not been able to report the results of the unstable analytes, so we report code [11] instead, which means the specimen could not be analyzed. This was the most troublesome problem limiting the expansion of CAP PT in western China, and it was also the reason we hesitated to apply for CAP accreditation. However, it is encouraging that the CAP Business Development Department has taken measures to solve the problem. We also tried to communicate with the responsible government department to shorten the approval time.
The improper specimen processing and preanalytic errors mentioned here belonged to the category of technical problems.9 Lapses in standard operating procedure accounted for the highest proportion of errors in this category. The unacceptable result for PO2 in 2010 was a typical example (Figure 2, E). The kit instructions suggested holding the top of the ampule to avoid transferring body heat to its contents and, just before use, shaking the specimen vigorously for 10 seconds. However, the technologist inverted the specimen for more than 1 minute while holding the entire body of the ampule. This incorrect handling changed the temperature of the specimen, which influenced the result for PO2. At the same time, the CAP handling instructions indicated that the operator should wait a minimum of 1 minute after shaking the specimen to let foam or small bubbles dissipate, and that the specimens should be sampled as soon as the ampule was opened, when possible aspirating directly into the instrument with the sample probe near the bottom of the ampule. In fact, the operator did not wait for the bubbles to dissipate and sampled regardless of the position of the sample probe. Improper processing inevitably influenced the results. In identifying the root cause of the error, the supervisor found the same erroneous operation by the technologist in calibrating the blood gas instrument. Outcomes of investigations into reasons for PT failures could be used to correct improper operations in routine work and improve the quality of patient care.10 To avoid these kinds of errors, we required the technologists to follow the manufacturer's instructions and the standard operating procedures of the laboratory.
Before 2008 our chemistry platform was an open system composed of Hitachi instrumentation and reagents from various sources. In particular, immunology analytes (immunoglobulin A, immunoglobulin G, rheumatoid factor, etc) were tested on biochemistry instruments rather than dedicated protein analyzers. Thus, the results often could not be evaluated in the appropriate peer group.
Clerical errors, including transcription errors, unit conversion errors, and code choice errors, should also be considered. We implemented new procedures requiring a double check of all transcriptions and data entries. In addition, we adopted international units for all analytes to avoid conversion errors.
CONCLUSION
Through participating in CAP PT programs for more than 5 years, we have gained much that helps guarantee the quality of patient care. Proficiency testing can play a beneficial educational role in several key areas. First, PT programs gave us informative feedback on testing performance and the sources of errors by providing written discussions and summary reports. CAP also maintains a Web site that offers an additional source of information; for example, it highlights important consensus guidelines that should be followed. Second, PT plays an important role by occasionally including samples with target specifications for analytes that challenge assays at levels rarely (albeit possibly) seen in the laboratory.11 This can alert us to potential problems before they are encountered with patient samples. CAP often designed PT specimens that encompassed the clinically relevant range of analyte concentrations or mimicked specific disease states likely to be encountered. The analyte concentration of 1 of 5 specimens might exceed the upper or lower limit of the reportable range, and the specimens might be borderline, weakly positive, or medium-reactive. CAP would later discuss the expected outcomes of these kinds of specimens in the summary data. Third, CAP provided the Linearity Surveys, which helped us monitor calibration status. Kroll et al12 found significant differences in good PT performance between laboratories enrolled in the Linearity Survey and those not enrolled. Our laboratory enrolled in Linearity Surveys in the areas of chemistry, hematology, immunology, and flow cytometry. The calibration and analytic measurement range were monitored on a periodic basis, obviating the need to verify them by sample dilution as required by the ISO 15189 accreditation criteria.
Therefore, specimen dilutions and calibration shift were not the main causes of PT failures, as they were in other studies.1,11,13–15 Finally, CAP PT provided more programs than other PT providers, such as antibody identification, antibody elution, India ink, and cell counts in body fluids and cerebrospinal fluid. The images of the films or photographs provided for morphologic examination were clear and well stained, and they were collected for personnel training in our laboratory. This is also why so many laboratories choose to participate in CAP PT.
Proficiency testing also helped improve laboratory performance. Proficiency testing allowed identification of methods with low error and better agreement, which encouraged the diagnostics industry to improve test systems. Improvements included simpler sample handling and sample addition, automatic or less frequent calibration, and more available or automated forms of quality control. Such changes contributed to fewer errors during testing, which might have translated into improved PT and laboratory performance.16 It encouraged us to select reliable and widely used methods carefully.7 A previous study found a direct relationship between increasing allowable error in quality control (QC) limits set by the laboratory and an increased rate of PT failures.11 This suggested that we should set appropriate QC limits that do not exceed the manufacturer's stated allowable error limits for monitoring stable performance of the assay system. Lawson et al17 also demonstrated that PT results were related to measures of performance in routine QC systems and were affected by laboratory accreditation status. Through 5 years of PT experience, we recognized that we should pay more attention to QC, and we acknowledged that accreditation did contribute to the improvement of PT and laboratory quality.
As a critical evaluation tool for laboratory performance, PT instructed us to routinely monitor the analytic performance and correct errors by narrowing the QC range, increasing the frequency of calibration, performing instrument function verification, and other quality improvement measures. The overall quality of laboratory medicine will be improved with increased numbers of correct test results, which promote better patient care.
References
Author notes
Xiaojuan Liu and Qingkai Dai contributed equally to this article.
Competing Interests
The authors have no relevant financial interest in the products or companies described in this article.