Abstract
The purpose of this study was to assess reproducibility of cephalometric measurements in cephalograms obtained by three dentomaxillofacial radiology clinics. Forty lateral cephalometric radiographs were selected and sent at different times to three different clinics for cephalometric analyses. Each clinic digitized the radiographs with the same resolution, and landmarks were located with the mouse pointer directly on the digitized radiographic image on the screen. Three cephalograms were obtained from each radiograph, totaling 120 analyses. Data were analyzed with analysis of variance. Of the 32 factors studied, reproducibility of results was satisfactory for only four factors: position of maxilla relative to anterior cranial base, inclination of occlusal plane relative to anterior cranial base, position of lower incisor relative to nasion-pogonion line, and soft-tissue profile of face (P < .05). Differences in cephalometric measurements were present and such differences were significant for most factors analyzed. The different cephalometric measurements obtained by the three dental radiology clinics were not reproducible.
INTRODUCTION
Cephalometric analysis is based on the identification of anatomic landmarks, some of which are difficult to identify. Therefore, some landmarks are more reproducible than others, and absolute accuracy can hardly be achieved because all identifications are subject to some degree of error. The difficulty in identifying cephalometric landmarks is associated with the fact that the images of anatomical structures overlap and that some landmarks are paired with one found on each side of the face. Consequently, they often appear as double, noncoinciding images on lateral radiographs.1–4 However, the value of cephalometric analyses depends on the accuracy of measurement techniques because errors in recordings may lead to an incorrect diagnosis.5–8
Dentomaxillofacial radiology clinics in Brazil usually trace the cephalograms and send them, together with the corresponding radiographs, to the orthodontists or oral and maxillofacial surgeons who will use them to establish diagnoses and plan therapeutic interventions. Several studies in the literature have shown the existence of variability in interobserver landmark identification depending on different individual conceptions of landmark definition and perception of landmarks.9–11 According to Trpkova et al12 and Rudolph et al,5 the validity of any measurement obtained from cephalometric radiographs depends on the reproducibility of cephalometric landmarks. The quality of the radiograph, the conditions under which measurements are made, and the skill and training of the person who traces the cephalograms affect the magnitude of error in landmark identification.
Lau et al,10 however, argued that the degree of error probably depends on individual conceptions of landmark definition and perception of landmark location rather than on training or experience. Kamoen et al13 studied the reproducibility of interobserver cephalometric tracings and concluded that there were significant differences in the identification of anatomic landmarks. They also found that tracing accuracy is a limiting factor in cephalometry and that the variation for each landmark is dependent on the quality of the cephalogram.
The purpose of this cross-sectional study was to assess the reproducibility of cephalometric measurements obtained by three different dentomaxillofacial radiology clinics.
MATERIALS AND METHODS
Forty radiographs were selected from patient files; the selection criterion was the technical quality of the radiographs. All radiographs were taken with the same x-ray device (Orthophos 3 Ceph/60–80 kV, 10 mA). Three clinics (A, B, and C) located in the city of Porto Alegre, Brazil, were randomly selected for the analyses.
Each clinic digitized the radiographic images and analyzed one lateral cephalogram for each of the 40 radiographs, giving a total of 120 cephalograms for this study. Each cephalogram comprised 32 cephalometric measurements (Table 1) corresponding to the analysis of one radiograph, so 3840 measurements (120 × 32) were analyzed.
The Radiocef system (Radio Memory Ltda, Belo Horizonte, Brazil) was used to prepare the cephalograms. This system uses an A4 desktop scanner with a transparency adapter to digitize the radiographic images. The digital image resolution was 150 dpi, 8 bits. Landmarks were located with the mouse pointer directly on the digitized radiographic image on the screen.
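As a point of reference for the stated scanning settings, the short sketch below converts the 150 dpi resolution into a nominal pixel size and the 8-bit depth into a number of gray levels. The function names are illustrative only and are not part of the Radiocef system; the calculation ignores radiographic magnification, so it describes the scanned film and not the anatomy itself.

```python
MM_PER_INCH = 25.4  # exact definition of the inch in millimetres


def pixel_size_mm(dpi: float) -> float:
    """Nominal edge length of one pixel on the scanned film, in millimetres."""
    return MM_PER_INCH / dpi


def gray_levels(bit_depth: int) -> int:
    """Number of distinct gray values available at the given bit depth."""
    return 2 ** bit_depth


pixel = pixel_size_mm(150)
print(f"{pixel:.3f} mm per pixel")  # 0.169 mm per pixel
print(gray_levels(8))               # 256
```

Under these assumptions, a landmark placed a few pixels away from another observer's choice already corresponds to a displacement of roughly half a millimetre on the film.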
To ensure that the analyses were performed as usual by the clinics, observers were blinded to the fact that analyses were part of a study. Therefore, a sample of what routinely happens in those clinics was obtained. No criterion standard was established for the accuracy of cephalograms prepared by the different clinics. The purpose of this study was to determine the reproducibility of the results and not to judge which clinic was right or wrong.
Analysis of variance (ANOVA) for three or more paired samples was used to compare each factor separately across the three clinics, using the SPSS GLM routine (general linear model, repeated measures). For the multiple comparison procedures, a significance level of 5% was established.
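The omnibus test behind this design can be sketched as a one-way repeated-measures ANOVA in which each radiograph is a subject and each clinic is a condition. The code below is a minimal illustration and not the SPSS routine used in the study: the function name and the three-radiograph data set are hypothetical, sphericity is assumed, and the pairwise multiple comparisons are not reproduced.

```python
from typing import List, Tuple


def repeated_measures_anova(data: List[List[float]]) -> Tuple[float, int, int]:
    """One-way repeated-measures ANOVA.

    data[i][j] is the value measured on radiograph i by clinic j.
    Returns (F statistic, df for clinics, df for error).
    """
    n = len(data)      # number of radiographs (subjects)
    k = len(data[0])   # number of clinics (conditions)
    grand = sum(sum(row) for row in data) / (n * k)
    clinic_means = [sum(row[j] for row in data) / n for j in range(k)]
    subject_means = [sum(row) / k for row in data]

    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_clinic = n * sum((m - grand) ** 2 for m in clinic_means)
    ss_subject = k * sum((m - grand) ** 2 for m in subject_means)
    # Residual variation after removing clinic and radiograph effects.
    ss_error = ss_total - ss_clinic - ss_subject

    df_clinic = k - 1
    df_error = (k - 1) * (n - 1)
    f_stat = (ss_clinic / df_clinic) / (ss_error / df_error)
    return f_stat, df_clinic, df_error


# Hypothetical angle values (degrees): rows are radiographs, columns are clinics A, B, C.
example = [
    [81.0, 82.0, 83.0],
    [82.0, 84.0, 86.0],
    [83.0, 83.0, 83.0],
]
f_stat, df_clinic, df_error = repeated_measures_anova(example)
print(f_stat, df_clinic, df_error)  # 3.0 2 4
```

The P value would then be read from the F distribution with the returned degrees of freedom and compared with the 5% level. With the study's design (40 radiographs, three clinics), the degrees of freedom would be 2 and 78 for each factor.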
RESULTS
The ANOVA results for each factor under study are shown in Table 2. The most important results are described below.
Only four factors did not show any differences between the three clinics: S-N.A; S-N.Occl; /1-NPog; H.(N-B).
A significant difference was found between the three clinics (A and B, A and C, B and C) in eight factors: S-N.D; (S-N).(Go-Me); 1/.NS; 1/.NA; FMIA; FMA; A-(V-T); CD (Vigorito cephalometric discrepancy).
A significant difference was found between clinics (A and B, A and C) in four factors: 1/./1; 1/-Orb; H-Nose; IMPA.
A significant difference was found between clinics (A and B, B and C) in nine factors: (N-Pog).(Po-Orb); N-A.Pog; S-N.B; A-N.B; S-N.Gn; 1/-NA; /1.NB; (Go-Me).(V-T); F.(V-T).
There was a significant difference between clinics (A and C, B and C) in four factors: (Go-Gn).Occl; genial tubercle; Iii-(V-T); H.(V-T).
A significant difference was found between clinics A and B in two factors: /1-NB; /1-Line I.
Results for the Pog-NB factor were significantly different between clinics A and C.
The maximum differences between the clinics were, in general, very high (Tables 3 through 5). Differences of up to 11.33 mm were found for linear measurements. Differences of up to 20.13° were found in angular measurements for the FMIA factor. The greatest difference found for Vigorito's CD was 17.70 mm (Table 4).
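A maximum difference of this kind can be computed as the largest absolute difference, over all radiographs, between the values reported by two clinics for the same factor. The sketch below assumes that definition; the function name and the three sample values per clinic are hypothetical and are not taken from Tables 3 through 5.

```python
from itertools import combinations
from typing import Dict, List, Tuple


def max_pairwise_differences(values: Dict[str, List[float]]) -> Dict[Tuple[str, str], float]:
    """Largest absolute per-radiograph difference for every pair of clinics.

    values maps a clinic label to its measurements, listed in the same
    radiograph order for every clinic.
    """
    result = {}
    for a, b in combinations(sorted(values), 2):
        result[(a, b)] = max(abs(x - y) for x, y in zip(values[a], values[b]))
    return result


# Hypothetical values (degrees) for one angular factor on three radiographs.
angle = {
    "A": [62.0, 58.5, 70.0],
    "B": [65.5, 57.0, 61.0],
    "C": [60.0, 59.5, 68.0],
}
print(max_pairwise_differences(angle))
# {('A', 'B'): 9.0, ('A', 'C'): 2.0, ('B', 'C'): 7.0}
```

Because the maximum is driven by single radiographs, it can be large even when the mean difference between two clinics is small.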
DISCUSSION
In computer-aided cephalometry, which uses digitized radiographs and specific software for tracings and measurements, the chance of error lies in locating and marking cephalometric landmarks. When the issue of identifying these landmarks is under discussion, questions about the accuracy of cephalometry are raised.5,9
Several authors have investigated the difficulty in landmark identification in studies that compare, for example, cephalograms traced from two consecutive radiographs obtained from the same patient;14,15 computerized and manual cephalometry;6,16 radiographic cephalometry determined for two samples, one of dry skulls and one of patients;17 radiographic cephalometry with or without steel ball markers;7,18 cephalometry on digitized and conventional digital radiographs;19 and accuracy on digital radiographs with varying image resolutions.11
Few authors have focused their studies on interobserver variation in landmark identification.10 Results have shown a high rate of interobserver errors in the identification of landmarks. In this study, we found interobserver variation in the identification of landmarks in cephalometric analyses conducted by different radiology clinics.
In Brazil, the current practice is for dental radiology clinics to prepare cephalograms. It has been noticed that the dentists who require such cephalograms hardly ever question the values of cephalometric measurements and base their decisions on measurements that may be wrong. The idea for this study arose from this observation.
Cephalometric errors may be classified as errors of acquisition, identification, or measurement.11 In this study, acquisition and measurement errors were controlled by using radiographs of good technical quality and by performing all analyses with the same computer-aided system. The results of this study revealed a significant difference in most of the measurements made by the three different clinics for the same radiograph. Such results strongly suggest that the clinics placed the landmarks at different sites. These findings are in agreement with those reported by several authors who have studied interobserver variation in the preparation of cephalograms.10,13,17,20 Significant differences were found between all clinics in eight factors: S-N.D; (S-N).(Go-Me); 1/.NS; 1/.NA; FMIA; FMA; A-(V-T); and CD (Vigorito's cephalometric discrepancy).
In some cases, the result from one clinic differed from those of the other two clinics, which did not differ significantly from each other. Clinics A and C showed the most agreement, with no significant difference for 46.9% of the factors. Clinics A and B had the least agreement (28.1%). The analysis of maximum differences in interincisal angle (1/./1) revealed differences of 18.96° (between clinics A and C), 13.85° (between A and B), and 8.63° (between B and C). For other measurements involving the long axis of the incisors, large maximum differences were also found: FMIA = 20.13°; IMPA = 18.06°; /1.NB = 16.28°; and 1/.NA = 12.9°.
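The agreement percentages can be recovered from the groupings listed in the Results by counting, for each pair of clinics, the factors for which that pair did not differ significantly. The sketch below encodes those groupings as (differing pairs, number of factors); the data structure and function name are illustrative, and the value for clinics B and C is derived here and is not stated in the text.

```python
PAIRS = ("AB", "AC", "BC")

# (clinic pairs with a significant difference, number of factors), as listed in Results.
GROUPS = [
    (set(), 4),                 # no differences
    ({"AB", "AC", "BC"}, 8),    # all pairs differ
    ({"AB", "AC"}, 4),
    ({"AB", "BC"}, 9),
    ({"AC", "BC"}, 4),
    ({"AB"}, 2),
    ({"AC"}, 1),                # Pog-NB
]


def agreement_counts(groups):
    """Total number of factors and, per clinic pair, the count without a significant difference."""
    total = sum(count for _, count in groups)
    agree = {
        pair: sum(count for differing, count in groups if pair not in differing)
        for pair in PAIRS
    }
    return total, agree


total, agree = agreement_counts(GROUPS)
print(total)  # 32
print(agree)  # {'AB': 9, 'AC': 15, 'BC': 11}
print(100 * agree["AC"] / total, 100 * agree["AB"] / total)  # 46.875 28.125
```

The counts sum to the 32 factors studied, and 15/32 and 9/32 correspond to the reported 46.9% and 28.1%.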
Chan et al9 also found difficulties in the identification of upper and lower incisor angulation. These differences, although present in single cases, are significant and may affect therapeutic decisions. A difference of up to 17.7 mm was found for CD (Vigorito). Cephalometric discrepancies, together with model discrepancies, may lead to misguided decisions about whether teeth should be extracted. For the factors S-N.A, S-N.Occl, /1-NPog, and H.(N-B), there were no statistically significant differences between the results of the three radiology clinics, which indicates that the landmarks involved were located at similar sites by the three clinics.
Further studies to investigate cephalometry should be conducted not only because of the importance of this diagnostic resource but also because such studies may develop new methods with a lower probability of error.
CONCLUSIONS
Our results indicate a very low reproducibility in the identification of cephalometric landmark points and angles for the majority of cephalometric measurements investigated in the study.
Of all the cephalometric measurements evaluated, the ones that showed the least reproducibility were S-N.D; (S-N).(Go-Me); 1/.NS; 1/.NA; FMIA; FMA; A-(V-T); and CD (Vigorito).
The aspects discussed above underscore the need to carefully evaluate the measurements provided by cephalometric analyses. They also show that observer calibration is fundamental for scientific purposes because of the high probability of error.
REFERENCES
Author notes
Corresponding author: Dr. Heraldo Luis Dias da Silveira, Departamento de Cirurgia e Ortopedia, Universidade Federal do Rio Grande do Sul, Rua Ramiro Barcelos, 2492, Porto Alegre, Rio Grande do Sul, Brazil 90035-003