Abstract
Objective: To evaluate intra-examiner and inter-examiner reliability of 3D CBCT-generated landmarks previously used in traditional 2D cephalometry.
Materials and Methods: Twenty-four CBCT scans taken with the NewTom 3G (Aperio Services, Verona, Italy) were randomly selected from patients participating in a clinical trial involving maxillary expansion treatments. The principal investigator located the landmarks five times, and four other investigators located the same landmarks once. Intra-examiner and inter-examiner reliability values were determined using intraclass correlation coefficients (ICCs). To assist in interpretation of the clinical significance of landmark identification differences, average mean differences for x, y, and z landmark coordinates were determined from the repeated assessments. Landmarks were then grouped by the region they represented and compared using repeated measures ANOVA, with multiple comparisons evaluated at a Bonferroni-corrected α.
Results: Intra-examiner and inter-examiner reliability for x, y, and z coordinates for all landmarks were acceptable, all being greater than 0.80. Most of the mean measurement differences obtained from trials within the principal investigator in all three axes were less than 1.5 mm. Inter-examiner mean measurement differences generally were larger than the intra-examiner differences.
Conclusions: Based on these results, the best landmarks for use in verifying expansion treatment results are Ekm; the buccal surfaces and apexes of upper molars, upper premolars, and upper canines; and the buccal surfaces of lower molars and lower canines. Foramen Spinosum, ELSA, Auditory External Meatus, and Dorsum Foramen Magnum demonstrated adequate reliability for determining a standardized reference system.
INTRODUCTION
Rapid maxillary expansion treatments have been used widely to correct maxillary transverse deficiency problems in adolescents. Several systematic reviews on maxillary expansion treatments and their effects on dental and skeletal structures have been published.1–4 Skeletal and dental changes produced through maxillary expansion have almost always been verified through two-dimensional (2D) cephalometric radiographs. This method has significant limitations in that these radiographs are subject to projection, landmark identification, and measurement errors.5,6
Advances in the use of three-dimensional (3D) imaging software have permitted important changes in the perception of 3D craniofacial structures. For these reasons, a trend toward changing imaging technology from traditional 2D analog films to 3D digital imaging systems is under way. The challenge for clinicians is to understand and interpret 3D imaging.7 Currently, no specific guidelines have been put forth about how to analyze this type of 3D image, and interpretation limitations still exist or are unknown. For this reason, new standards are required, and clinicians need special training to deal effectively with 3D craniofacial images.
The purpose of the present study was to evaluate intra-examiner and inter-examiner reliability of 3D CBCT-generated landmarks that have been considered in previous publications for which traditional 2D cephalometry had been used to diagnose the need for or outcome of maxillary expansion.
MATERIALS AND METHODS
CBCT scans obtained from patients participating in a clinical trial involving maxillary expansion treatments (group with maxillary expanders and control group) at two different time points (baseline and 6 months) were used for this study. Twenty-four CBCTs were randomly selected from the total pool; half of them were taken from each time point. No subject had more than one CBCT included.
CBCT scans were taken using the NewTom 3G (Aperio Services, Verona, Italy) at 110 kV, 6.19 mAs, and 8 mm aluminum filtration. Each image was obtained from 360 slices and was converted to DICOM format with the NewTom software. The DICOM images were then rendered into a volumetric image using AMIRA software (AMIRA™, Mercury Computer Systems Inc, Berlin, Germany). Sagittal, axial, and coronal volumetric slices, as well as a 3D reconstruction of the image, were used to determine landmark positions (Figure 1). In this system, the XY-plane moves from top to bottom, the XZ-plane moves from front to back, and the YZ-plane moves from left to right. The predetermined coordinate system and origin (0, 0, 0) established by AMIRA for each image were used and were the same for every examiner. The principal investigator located the landmarks five times on different days, with repeated trials on each image performed at least 1 week apart. Four other investigators also located the landmarks once for each image. To reduce fatigue effects, investigators were advised to stop placing markers once they felt tired and to continue on another day. Spherical markers of 0.5 mm diameter were placed with the center of each marker at the exact location of the landmark. A description and a definition for each landmark used are given in Table 1.
Intra-examiner and inter-examiner reliability values were determined using ICCs. To assist in the interpretation of the clinical significance of landmark identification differences, average mean differences (landmark identification error) for x, y, and z landmark coordinates were summarized with descriptive statistics, once from repeated assessments within the same examiner (five trials) and once between examiners (five examiners). Thereafter, landmarks were grouped by the region they represented and compared using repeated measures ANOVA, with all pairwise comparisons made using the Bonferroni method.
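The ICC form used is not specified in the text. As a minimal sketch, assuming a two-way random-effects, absolute-agreement, single-measure ICC (often written ICC(2,1)), the coefficient for one coordinate of one landmark could be computed from an images-by-raters table as shown below. The function name `icc_2_1`, the layout of `ratings`, and the example coordinates are illustrative assumptions, not the study's analysis code or data.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single measure.

    ratings: list of rows, one row per image (subject), one column per
    rater (or per repeated trial), holding a single coordinate in mm.
    """
    n = len(ratings)       # number of images
    k = len(ratings[0])    # number of raters or trials
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares without replication
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in ratings for v in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical x coordinates (mm) of one landmark: 4 images, 3 examiners
x_coords = [
    [12.1, 12.4, 12.0],
    [15.3, 15.1, 15.6],
    [9.8, 10.2, 9.9],
    [13.7, 13.5, 14.0],
]
icc_x = icc_2_1(x_coords)
```

For the intra-examiner case the columns would hold the five trials of the principal investigator; for the inter-examiner case they would hold the five examiners. A different ICC form (for example, consistency rather than absolute agreement) would give different values.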
RESULTS
Intra-examiner reliability for x, y, and z coordinates for all landmarks was greater than 0.97 with 95% confidence interval (CI, 0.96, 0.99). Inter-examiner reliability for x, y, and z coordinates for all landmarks was greater than 0.92 (CI, 0.87, 0.96), with the exception of the x-components of the auditory external meatus left 0.84 (CI, 0.61, 0.94), auditory external meatus right 0.90 (CI, 0.73, 0.96), orbit left 0.83 (CI, 0.52, 0.93), and orbit right 0.80 (CI, 0.49, 0.92) landmarks.
Mean measurement differences obtained from trials within the principal investigator in all three axes were less than 1.5 mm, except for Piriform right, whose 1.53 mm difference in the z coordinate was the highest mean difference obtained (Table 2). AEM left, AEM right, and Zm left had more than 1.0 mm mean difference in the x coordinate. No landmarks had mean differences greater than 1.0 mm in the y coordinate. In the z coordinate, A, B, Piriform left, Piriform right, Ekm left, and Ekm right had more than 1.0 mm mean difference.
Table 2. Intra-examiner Absolute Mean Measurement Difference (mm) in Coordinates of Landmarks Based on Five Trials

Mean measurement differences obtained from trials of the five examiners generally were larger than the intra-examiner differences, the highest being 3.61 mm for Orbit left in the x-axis (Table 3). In the x coordinate, Orbit left and Orbit right had mean differences greater than 2.5 mm, and Zm left and Zm right had mean differences greater than 1.5 mm. In the y coordinate, no landmarks had a mean difference greater than 2.5 mm, but AEM left, Piriform left, Orbit left, Orbit right, MB 36 apex, MB 46 apex, and ANS had mean differences greater than 1.5 mm. In the z coordinate, Piriform left and Piriform right had mean differences greater than 2.5 mm, and A, Orbit left, Ekm left, and Ekm right all had mean differences greater than 1.5 mm.
Table 3. Inter-examiner Absolute Mean Difference (mm) in Coordinates of Landmarks Based on Five Examiners

Landmarks were divided into groups corresponding to the region they represent, and repeated measures ANOVA testing was applied to each of these groups to test for statistically significant differences between landmarks. Table 4(a–h) shows the landmarks that presented statistically significant differences based on Bonferroni pairwise comparisons between landmarks in the region they represent. AEML and AEMR presented the greatest statistical differences in the x-axis and y-axis when compared with other landmarks in the same region (Table 4a and 4b). In the skeletal facial region, most landmarks presented statistical differences with other landmarks in all axes (Table 4c and 4d). Among the maxillary dental landmarks, 26B and 26A presented the greatest statistical differences from the others (Table 4e and 4f). Among the mandibular dental landmarks, 36A and 46A presented the greatest statistical differences from other landmarks in the same region (Table 4g and 4h).
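The Bonferroni step of these pairwise comparisons can be sketched as follows. This is a minimal illustration, assuming a family-wise α of 0.05 (the level is not stated in the text) and assuming the unadjusted p-values for each landmark pair are already available from the pairwise tests. The function names and the p-values are hypothetical and are not taken from Table 4.

```python
def n_pairwise(n_landmarks):
    """Number of distinct landmark pairs within one region."""
    return n_landmarks * (n_landmarks - 1) // 2


def bonferroni_pairwise(p_values, alpha=0.05):
    """Apply a Bonferroni-corrected alpha to a family of pairwise tests.

    p_values: dict mapping (landmark_a, landmark_b) -> unadjusted p-value.
    Returns the corrected per-comparison alpha and the set of pairs whose
    p-value falls below it.
    """
    if not p_values:
        raise ValueError("at least one pairwise comparison is required")
    adjusted_alpha = alpha / len(p_values)
    significant = {pair for pair, p in p_values.items() if p < adjusted_alpha}
    return adjusted_alpha, significant


# Hypothetical unadjusted p-values for four landmarks in one region
example_p = {
    ("AEML", "AEMR"): 0.20,
    ("AEML", "FSL"): 0.001,
    ("AEML", "FSR"): 0.004,
    ("AEMR", "FSL"): 0.02,
    ("AEMR", "FSR"): 0.009,
    ("FSL", "FSR"): 0.60,
}
# Six comparisons, so each is judged against 0.05 / 6 (about 0.0083)
adjusted_alpha, significant_pairs = bonferroni_pairwise(example_p)
```

With four landmarks there are six comparisons, so a pair with p = 0.009 would not be declared different even though it falls below the unadjusted 0.05 level. The correction becomes stricter as the number of landmarks in a region grows.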
DISCUSSION
The use of CBCT or CT overcomes limitations present in traditional 2D cephalometric analysis, in which there is overlapping of structures leading to landmark identification errors that affect determination of real changes present in maxillary expansion treatments.8–10 Several studies9–12 have analyzed 3D changes using CBCTs and CTs in maxillary expansion treatment. A common factor among all these studies is the use of only linear and angular measurements instead of a 3D coordinate system to verify changes in maxillary expansion treatments in a true 3D format.
Swennen et al13 understood the need for a 3D-based measurement analysis when using a 3D Cartesian system. They used common 2D cephalometric landmarks to determine a standardized reference position from which to locate skulls, followed by determination of 3D position changes using different landmarks. The disadvantage of their approach was the use of landmarks located in skull structures prone to growth-based changes (landmarks forming Frankfurt Horizontal plane, Sella, and Nasion) that could occur concurrently with treatment changes, thus potentially skewing the results depending on the time of follow-up of patients.
Tausche et al14 used a similar 3D Cartesian system approach to determine changes after maxillary expansion treatments. The advantage of their approach was the use of landmarks present in the cranial base to standardize the skull position. However, the study did not reach its full potential by reporting changes in 3D but instead reported changes with respect to linear and angular measurements.
Published reliability values with respect to coordinates for landmarks used in lateral and posteroanterior cephalometrics are not very common. Some studies5,15–17 did report reliability values for x and y coordinates for several points used in this study. One meta-analysis presented an overall analysis of reliability values for some lateral cephalometric landmarks.18 The range of reliability values identified in the present study was generally similar to that reported in other 2D studies. A tendency found in the studies was that points such as Orbitale, Piriform, and Porion (in this study known as AEM) showed the largest errors, similar to the present results.
Based on the present results, several factors influencing choice of landmarks in analysis of CBCT images can be identified. Ideally, the landmarks would be identified easily in the 3D images without the assistance of tomographic slices. Landmarks with small identification errors are located in areas of high-density contrast with adjacent structures and are located on sharply curved or pointed structures. Landmarks located in the center of a foramen are also good choices. Landmarks used as superimposition references should be located in nongrowing structures and at a distance from the region being influenced by treatment, to reduce the effect of individual landmark placement. Ideally, several reference landmarks will be chosen that are located at a significant distance from each other and in different planes of space to obtain a 3D coordinate system. Constructed landmarks based on two distant well-defined landmarks are also useful. Landmarks also need to be identified in the “region of interest” that will be representative of the structure being evaluated. These landmarks should be identified easily at any stage of growth and treatment. The choice of these landmarks should take into account the identification error in the axis of interest. Finally, the choice of landmarks should be customized based on the type of treatment or growth effects that are being assessed.
Inter-examiner mean differences were greater than intra-examiner differences. This can be explained based on the examiner's interpretation of landmark definition and individual anatomic variations. Furthermore, operator experience using CBCT images and AMIRA software may have influenced the study results by having a greater impact on inter-examiner reliability.
The clinical significance of error in repeated landmark location is difficult to define and will depend on the purpose of analysis. For diagnostic purposes, in which a specific patient is compared with population norms, inter-examiner reliability should be carefully considered, and variations higher than 1.5 mm could be considered clinically significant. When different time points are analyzed, the impact of cumulative landmark location errors should be considered. In situations where the effect of growth or treatment intervention is being evaluated with superimposition, intra-examiner reliability is of primary importance. In this case, landmarks with variations higher than 1.0 mm would be of clinical significance. The size of the structure being investigated and the magnitude of change to be detected will also influence the clinical significance of landmark identification error. Landmark identification error may be different in x, y, and z coordinates, and some landmarks may be useful for detecting change in one axis but not in another. For example, Piriform has low intra-examiner landmark identification error in the transverse dimension but high error in the vertical dimension. Piriform may be useful for assessing changes in nasal width in maxillary expansion, but should be avoided in assessing vertical change.
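The axis-by-axis reasoning above can be sketched in code. The exact formula behind the reported mean differences is not given, so this sketch assumes the error for one landmark on one image is the mean absolute difference over all pairs of repeated placements, evaluated separately for x, y, and z. The function names and coordinates are hypothetical; the 1.0 mm threshold is the superimposition value suggested in the text.

```python
from itertools import combinations
from statistics import mean


def mean_abs_difference(values):
    """Mean absolute difference over all pairs of repeated measurements (mm)."""
    return mean(abs(a - b) for a, b in combinations(values, 2))


def flag_axes(trials, threshold_mm):
    """Per-axis identification error for one landmark on one image.

    trials: list of (x, y, z) placements in mm, one per trial or examiner.
    Returns {axis: (error_mm, exceeds_threshold)} for axes "x", "y", "z".
    """
    result = {}
    for index, axis in enumerate("xyz"):
        error = mean_abs_difference([t[index] for t in trials])
        result[axis] = (error, error > threshold_mm)
    return result


# Hypothetical placements: consistent in x and y, scattered in z
trials = [(10.0, 5.0, 2.0), (10.2, 5.0, 4.0), (10.4, 5.0, 3.0)]
flags = flag_axes(trials, threshold_mm=1.0)
# Only the z axis exceeds the 1.0 mm threshold in this example
```

A landmark flagged in only one axis, as in this example, could still be used for measurements along the other two axes, which mirrors the Piriform case described above. In a full analysis the per-image errors would be averaged over all images before comparison with the threshold.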
New landmarks are available from CBCT imaging that could not be visualized with traditional 2D imaging. These landmarks would give us new tools for diagnosis and measurement of growth and treatment changes and may overcome limitations found in 2D imaging. For example, dental pulp chambers can be used to assess 3D changes in tooth position. Nerve foramina in the maxilla and mandible (infraorbital foramen, mental foramen, inferior dental nerve foramen, anterior nasal foramen) are also possible choices. The validity of skeletal and dental landmarks to represent the region of interest would have to be determined by comparing diagnostic measures from untreated normal populations vs untreated abnormal populations. Large standard deviations or no difference in landmark locations between these two different populations would suggest that it is not useful for diagnostic analysis.
In the present study, the majority of landmarks presented intra-examiner measurement errors of less than 1 mm in each axis. Ekm left and right and Piriform left and right presented measurement errors of between 1 and 2 mm in the z-axis. It was difficult to locate them in the 3D view because parts of these structures are formed of thin bone that may not be clearly visualized with CBCT. Piriform landmarks are located in the outer portion of the convexity of the nasal cavity. The bone in this area is thin and of low density; thus, visualization of this bone is very dependent on the threshold used in the software. Some tooth apexes are difficult to visualize because of low-density contrast with adjacent bone. The auditory canal is a cylinder-type structure, and in the x-axis dimension, AEM could be placed in a variety of positions along the length of the canal. Zm left and right are difficult to locate in patients who do not present with a distinct zygomaxillary notch.
With respect to inter-examiner measurement error, most landmarks presented measurement errors of less than 1 mm for each coordinate. Landmarks that presented the highest measurement errors were AEML and AEMR in the x-axis; Piriform left and right and Ekm left and right, all in the z-axis; and Orbit left and right in the x and y axes. A common factor among all these landmarks is that they are located in structures formed by relatively flat surfaces, which makes it difficult to pinpoint the exact location. Several apex landmarks presented measurement errors of between 1 and 2 mm. This could be considered clinically important depending on the use of these landmarks, especially if they are used to assess torque expression or root resorption, where the measured changes are minuscule.
The mesiobuccal apex of the lower molars presented some problems in identification because this root curves and joins the mesiolingual root, making it difficult to pinpoint the exact apex tip of the root of interest. A, ANS, PNS, and Prosthion are landmarks that should be used cautiously because immediately after expansion, when the suture is separated, the bone present in the midportion of the maxilla is nonexistent or very thin. In the mandible, B point can present momentary changes in the vertical dimension as a result of bite opening caused by biting into the hyrax appliance rather than by treatment-related changes. These same points could be useful for evaluating changes when the palatal suture is completely ossified.
Upon reviewing the previous explanations and descriptions of problems related to several landmarks, it is not surprising to observe the results obtained when the statistical significance of each landmark is evaluated in its region of interest.
Selection of landmarks for use in 3D image analysis should follow certain characteristics. Overall, from the landmarks measured, the best landmarks in each region of interest to be used for diagnosis and treatment with maxillary expansion are EkmL, EkmR, 16B, 16A, 14B, 14A, 13B, 13A, 23B, 23A, 24B, 24A, 26B, 26A, 36B, 33B, 46B, and 43B. Figures 2 and 3 illustrate some of the most reliable landmarks. It should be noted that with the exception of two landmarks (EkmL and EkmR), all are located in dental structures and thus would serve to measure only dental changes, while skeletal change information would be limited. Landmarks FSL, FSR, ELSA, AEML, AEMR, and DFM are suitable for establishing a standardized reference system because of their location, reliability, and stability at the ages when patients require conventional orthodontic treatment.
CONCLUSIONS
Ekm; the buccal surfaces and apexes of upper molars, upper premolars, and upper canines; and the buccal surfaces of lower molars and lower canines are adequate landmarks for use in verifying expansion treatment results.
Foramen Spinosum, ELSA, AEM, and Dorsum Foramen Magnum demonstrated adequate reliability and could be used for determining a standardized reference system; however, additional analysis is required to verify their adequacy.