Objective: To evaluate the reliability of landmark identification in posteroanterior cephalometrics.
Materials and Methods: A literature search was conducted to identify all articles concerning landmark identification error in the frontal radiograph. Electronic databases (PubMed, Web of Science, Cochrane Database, PubMed Central, and HubMed) were searched. Abstracts that appeared to fulfill the initial selection criteria were selected, and the full-text original articles were then retrieved and analyzed. Only articles that fulfilled the initial selection criteria were finally considered. Their references were also hand searched for possible missing articles from the database searches.
Results: Twelve abstracts met the initial inclusion criteria, and these articles were retrieved. Of these, eight were immediately rejected because of methodological issues. The remaining four articles seemed to fulfill the selection criteria, but three were later rejected, two because no mean values for landmark identification error were provided and one because of the sample. Only one article fulfilled the inclusion and exclusion criteria of this study. Midline landmarks were more reproducible than bilateral skeletal landmarks.
Conclusion: Only one study fulfilled the additional inclusion and exclusion criteria. Few studies exist about the random error in localization of landmarks in posteroanterior cephalograms, and several methodological issues affected these few studies. Thus, future well-designed studies are needed to allow the orthodontist to choose the most appropriate cephalometric analysis.
INTRODUCTION
Orthodontic diagnosis relies largely on cephalometric radiographs. Among these radiographs, the posteroanterior cephalogram (PAC) is important in evaluating transverse skeletal and dentoalveolar relationships for a correct quantification of bilateral structural problems.1 The PAC, in fact, contains important diagnostic information that allows the evaluation of patients with functional, dentoalveolar, and/or facial asymmetries.2,3 However, the use of the PAC has some limitations, including the difficulty in reproducing head posture and errors in identifying landmarks.4,5
Errors in cephalometric analysis generally comprise projection errors,2 which arise from the geometry of the radiographic setup, and random errors. The latter involve tracing, landmark identification, and measurement.6 Random errors now arise mainly from the uncertainty in locating specific anatomic landmarks on the radiograph, since computer-aided cephalometric analysis has eliminated the mechanical errors in drawing lines between landmarks and in measuring with a protractor.
In turn, the precision with which any landmark may be identified depends on a number of factors. Landmarks lying on a sharp curve or at the intersection of two curves are generally easier to identify than points located on flat or broad curves. Points located in areas of high contrast are easier to identify than points located in areas of low contrast. Superimposition of other structures over the area of the landmark in question reduces the ease of identification. Precise written definitions describing the landmarks and clinicians' training reduce the chance of interpretation error.2
Regardless of the clinical or research application, it is critical to know the reliability of PAC landmarks. However, although there is a need to select PAC landmarks with optimum reproducibility, no comprehensive review was found in the literature. The purposes of this systematic review are to analyze the reproducibility of landmark identification in PAC and to evaluate its clinical significance.
MATERIALS AND METHODS
To identify all studies that examined the reliability of landmark identification in posteroanterior cephalometrics, a literature survey was carried out by searching the following electronic databases:
ISI WoS Science Citation Index Expanded (http://portal.isiknowledge.com),
Cochrane Database of Systematic Reviews (http://www.cochrane.org/reviews/index.htm),
PubMed Central (http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=pmc), and
The survey covered the period from October 1975 to April 2007 for PubMed, from February 1994 to April 2007 for ISI, from December 1994 to April 2007 for PubMed Central, and from January 1987 to April 2007 for PubMed.
To develop search terms for databases, the Medical Subject Heading (MeSH) database was used to look for MeSH terms for cephalometry. According to this search, the term cephalomet* was crossed with a combination of the following key words to identify articles of interest: cephalogram*, posteroanterior, frontal, landmark*, reproducibility, reliability, accuracy, and error*.
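The crossing of search terms described above can be sketched as follows. This is only an illustration: the keywords are those listed in the text, but the exact query syntax submitted to each database is not reported, so pairing the base term with each keyword through a Boolean AND is an assumption, and the function name build_queries is hypothetical.

```python
# Keywords are taken from the text; the AND-pairing is an assumption,
# since the exact query syntax used for each database is not reported.
BASE_TERM = "cephalomet*"
KEYWORDS = [
    "cephalogram*", "posteroanterior", "frontal", "landmark*",
    "reproducibility", "reliability", "accuracy", "error*",
]


def build_queries(base, keywords):
    """Return one 'base AND keyword' query string per keyword."""
    return [f"{base} AND {kw}" for kw in keywords]


queries = build_queries(BASE_TERM, KEYWORDS)
for query in queries:
    print(query)
```

Each printed string would then be entered into a database search field; truncation symbols such as the asterisk may be interpreted differently by different databases.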
The following inclusion and exclusion criteria were chosen initially to select potential published abstracts: posteroanterior cephalometrics, landmark reproducibility, and no case reports. No language restriction was applied during the identification process of the published studies.
Eligibility of the selected studies was determined by reading the abstracts of the articles identified by the search. At this stage, no attempt was made to exclude studies that did not report measurement errors separately for the x and y coordinates or that had other limitations in materials and methods, which are described below. It was considered improbable that the abstracts would report enough information about these criteria, and relying too heavily on them might have excluded some useful articles. All article abstracts that appeared to meet the initial inclusion criteria were selected, and the actual articles were collected.
The articles ultimately selected were chosen with the following additional inclusion criteria: randomly selected PACs, identification errors reported on the x-axis and y-axis, appropriate statistical analysis applied to interpret the findings, and standardized cephalometric technique and/or equipment conditions. The exclusion criteria were reliability represented only graphically (diagram and/or scattergram) without any mean value reported, recordings in a noncoordinate system (x- and y-axes),7 studies on dry skulls with metal markers,1,3,8–10 reproducibility of measurements and distances,8,11 and projection errors.12,13
Two reviewers (Drs Leonardi and Annunziata) independently assessed all the articles with respect to the inclusion and exclusion criteria, and the Kappa score measuring the level of agreement was .92 (very good). A consensus was reached regarding which articles fulfilled the final selection criteria and were finally included in the systematic review. The reference lists of the retrieved articles were also hand searched for additional relevant articles that might have been missed in the database searches.
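For readers unfamiliar with the Kappa statistic, the following minimal sketch shows how Cohen's kappa is computed from two reviewers' decisions. The include/exclude decisions below are hypothetical and are not the reviewers' actual assessments, so the result does not reproduce the reported value of .92.

```python
def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters who judged the same items."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must judge the same non-empty item list")
    n = len(ratings_a)
    labels = set(ratings_a) | set(ratings_b)
    # Observed agreement: proportion of items given identical labels.
    p_observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of the raters' marginal proportions per label.
    p_expected = sum(
        (ratings_a.count(label) / n) * (ratings_b.count(label) / n)
        for label in labels
    )
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical decisions on 12 articles; the reviewers disagree on one.
reviewer_a = ["include"] * 4 + ["exclude"] * 8
reviewer_b = ["include"] * 3 + ["exclude"] * 9
kappa = cohen_kappa(reviewer_a, reviewer_b)
print(round(kappa, 2))  # 0.8
```

Kappa corrects the raw agreement (here 11 of 12 articles) for the agreement expected by chance, which is why it is lower than the simple proportion of matching decisions.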
Hand searching was conducted also on the following journals from 1984 to April 2007: European Journal of Orthodontics, American Journal of Orthodontics and Dentofacial Orthopedics, British Journal of Orthodontics, International Journal of Adult Orthodontics and Orthognathic Surgery (first published in 1986), Journal of Clinical Orthodontics, and Clinical Orthodontics and Research (first published in 1998).
RESULTS
The search results and the final number of abstracts selected according to the initial selection criteria from the various databases are provided in Table 1. The initial inclusion criteria for this systematic review were landmark reproducibility and/or reliability in PACs.
A different number of hits was found depending on the electronic database selected. PubMed identified the most abstracts (82), followed by ISI Web of Knowledge (22). The remaining databases had only a few hits (Table 1). No study was identified by hand search. Of the total abstracts identified in the electronic databases, only a relatively small percentage fulfilled the initial inclusion criteria. In fact, only 12 articles of the 82 identified by PubMed fulfilled the initial selection criteria. Of these 12 articles, 8 were immediately rejected because of methodological issues1,3,7–10 (Table 2). Finally, only 4 articles seemed to fulfill the additional selection criteria in some part of their study, but three of them were later excluded for several reasons. The study by El-Mangoury13 described only the intraexaminer reliability of 13 cephalometric landmarks on 40 PACs on two occasions and reported the mean and standard errors for each point. However, the study was not selected because the sample consisted of 40 clear PAC films and was therefore not randomly selected. A further source of bias in this study was that the magnitude of error associated with the equipment used for landmark registration was neither assessed nor reported.
The study by Major et al2 described the intraexaminer and interexaminer reliability of 22 bilateral skeletal and dental landmarks and 8 midline skeletal landmarks on PACs in a sample of dry skulls and patients' PACs. As findings were presented separately for dry skulls and patient PACs, it was possible to evaluate separately the data referred to patients and those found on dry skulls. Unfortunately, no mean values but only standard deviations for landmark identification errors were presented in this study.
The study by Athanasiou et al14 evaluated the intraexaminer and interexaminer error of 14 bilateral skeletal and dental landmarks and six midline landmarks on 30 patients' PACs. Unfortunately, data for the intraexaminer reliability were presented graphically, and no mean value was provided. Data on the interexaminer reliability were presented as a standard deviation for each landmark in X and Y directions, respectively, for each examiner.
Therefore, neither of these two studies was selected, as mean values were not reported by either investigation. In fact, standard errors and/or standard deviations describe only the dispersion of the data around a mean, not the exact magnitude of a landmark identification error on the x- and y-axes.
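The distinction between dispersion and magnitude can be illustrated with a small numerical sketch. The error values below are hypothetical (arbitrary units along one axis) and are not taken from any of the reviewed studies; they only show that two landmarks can share the same standard deviation while differing greatly in mean error.

```python
import statistics

# Hypothetical identification errors along one axis for two landmarks.
errors_landmark_a = [0.1, 0.3, 0.5, 0.7, 0.9]
errors_landmark_b = [2.1, 2.3, 2.5, 2.7, 2.9]

mean_a = statistics.mean(errors_landmark_a)
mean_b = statistics.mean(errors_landmark_b)
sd_a = statistics.stdev(errors_landmark_a)
sd_b = statistics.stdev(errors_landmark_b)

# Same spread around the mean, very different error magnitude.
print(f"landmark A: mean={mean_a:.2f}, sd={sd_a:.2f}")
print(f"landmark B: mean={mean_b:.2f}, sd={sd_b:.2f}")
```

A report that gave only the standard deviations would make these two landmarks look equally reliable, although the second is displaced much farther on average.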
Only one study15 fulfilled the additional inclusion and exclusion criteria. This study measured interexaminer error in a sample of 20 dry skulls. The landmarks included four midline landmarks (anterior nasal spine, upper incisal midpoint, lower incisal midpoint, and Pogonion),2,15 bilateral skeletal landmarks (the extreme superior orbital margin, the intersection of the zygomaticofrontal suture and the lateral orbital margin, the most superior aspect of the condyle, and the antegonial notches), and one bilateral dental landmark (the lowermost buccal cusps of the right and left maxillary molar). The results showed that the variation in interexaminer point identification was relatively large, and the degree of error varied greatly according to the point concerned and to the vertical and horizontal orientation (eg, the error in detecting Pogonion was much higher in the vertical coordinate, 2.4 ± 1.62, than in the horizontal coordinate, 0.4 ± 0.42). On the other hand, in the case of condylar points and frontozygomatic suture points, the level of error increased sharply in both the vertical and horizontal directions.
DISCUSSION
PAC is used by researchers and clinicians for quantitative (and qualitative) evaluation of the craniofaciodental symmetry in the frontal plane. Apart from the quality of the cephalogram, the radiographic setup, and the accuracy of the digitizing equipment, the impact of such an evaluation depends mostly on the accuracy with which an examiner can localize the relevant landmarks.14
Although there are claims on the poor reproducibility of landmarks on PAC,2,13,14 no review was found in the literature. Therefore, this systematic review was carried out to summarize the findings reported in the literature and to enable the orthodontists to look at their cephalometric numbers while being aware of possible variations.11 In fact, a comprehensive review of the landmark identification error in both the horizontal and vertical directions is essential in establishing a valid analysis.2
Landmarks with a large horizontal identification error should be avoided in transverse measurements. Similarly, landmarks with a large vertical identification error should be avoided in measuring the vertical structural relationship.2
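This selection rule can be sketched as a simple filter on the mean error per axis. The Pogonion values are those quoted in the Results; the cut-off value and the function name usable_for are hypothetical, since the text does not propose a specific threshold.

```python
# Mean identification error per landmark as (horizontal, vertical).
# Pogonion values are those quoted in the text; a real analysis would
# list every landmark under consideration.
mean_errors = {"Pogonion": (0.4, 2.4)}

THRESHOLD = 1.0  # hypothetical cut-off, in the same units as the errors


def usable_for(errors, axis, threshold):
    """Return landmarks whose mean error on the given axis is within threshold."""
    if axis not in ("horizontal", "vertical"):
        raise ValueError("axis must be 'horizontal' or 'vertical'")
    index = 0 if axis == "horizontal" else 1
    return sorted(name for name, err in errors.items() if err[index] <= threshold)


transverse_ok = usable_for(mean_errors, "horizontal", THRESHOLD)
vertical_ok = usable_for(mean_errors, "vertical", THRESHOLD)
print(transverse_ok)  # ['Pogonion']
print(vertical_ok)    # []
```

Under this assumed threshold, Pogonion would be retained for transverse measurements but avoided for vertical ones, in line with the larger vertical error reported for it.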
Among 12 studies concerning the reliability of landmark identification in PACs, only one study fulfilled the final selection criteria for this systematic review. Eight studies were discarded at first, and three investigations2,13,14 were discarded later. The one study included in this systematic review was carried out on dry skulls. However, conclusions from this study should be drawn with care; it should be underlined that findings from this material underestimate the landmark identification errors, as cephalometric points are detected much more easily on dry skulls. In patient radiographs, the soft tissue reduces the sharpness of the hard tissue image, and skeletal and dental angular errors are greater in the presence of soft tissue.16 These differences are up to four times larger for some measurements in the presence of soft tissue.16
There were several reasons to exclude studies from this investigation. Among these, the most important reasons were the use of dry skulls with metal markers for the easier detection of the cephalometric landmark1,3,8,10 and the absence of any mean value of landmark identification error for each point2,15 but only the standard deviation2,14 and standard error.2,13 Therefore, even though the studies are well planned and some graphic representations of errors are provided, the lack of mean values made it impossible to know the exact deviation/distance of that point from the gold standard. This is necessary to carry out a comparison between data from different studies or to integrate results from several studies. The only knowledge gained from these studies is that some points are far from the mean value and others are closer.
Therefore, the statement that the identification of dental landmarks is more difficult than the identification of mastoid, latero-orbitale, and antegonion, even if conceivable, should be confirmed.
Further studies that evaluate PAC points on both analog and digital radiographs are required. Meanwhile, the new three-dimensional cephalometry method will be well established and widely used.
CONCLUSIONS
Although several investigations have evaluated the random error in the localization of landmarks on lateral cephalograms, only a few studies exist regarding this aspect in PACs.
Several methodological issues affected these few studies, and therefore, only one study was included in this systematic review.
The mean interexaminer error was lowest for midline points and greater for bilateral skeletal landmarks.
Limited evidence is available about random errors in the localization of landmarks on the PAC. Future well-designed studies of both digital and analog radiographs are required to enable the orthodontist to choose the proper cephalometric analysis.
Corresponding author: Dr Rosalia Leonardi, Department of Orthodontics, University of Catania, Via S. Sofia n 78, Catania, Italy (email@example.com)