ABSTRACT
To test the reliability of lateral cephalometric radiographs (LCRs) for use in the assessment of the upper airway, hyoid bone, soft palate, and tongue.
The records of 57 healthy Chinese children from a nonhospital population (mean age = 12.6 years, SD = 0.5, 28 males and 29 females) who received two consecutive LCRs in the natural head posture were retrospectively analyzed. Fifteen linear, angular, and area measurements were used to describe the airway, hyoid bone, soft palate, and tongue. The reliability between the two LCRs was assessed with the intraclass correlation coefficient (ICC) and F-test. Errors were estimated with the Dahlberg and Bland-Altman methods, and intra- and inter-assessor agreements were determined.
Measurements of upper airway and hyoid bone had excellent method reliability, intra-assessor reliability, and inter-assessor reliability (ICC > 0.8). However, the method reliability and the inter-assessor reliability for soft palate and tongue were less favorable (ICC from 0.60 to 0.96). Soft palate area and thickness were the most critical parameters. Intra-assessor reliability was greater than both method reliability and inter-assessor reliability (which were similar).
Measurements of upper airway morphology, defined as the intramural space, and of the hyoid bone position were highly reliable on LCRs of children. However, the limited reliability in the assessment of tongue and soft palate area may compromise the diagnostic application of LCRs to these structures.
INTRODUCTION
Lateral cephalometric radiographs (LCRs) have been widely used as a screening tool for children with suspected sleep-disordered breathing,1 which may be related to metabolic, cardiovascular, and neurocognitive morbidity in young people.2 The diagnostic application of LCRs has been recognized by a meta-analysis concluding that there was reduced sagittal width of the upper airway in children with obstructive sleep apnea.3
LCRs have been used to investigate the intramural airway spaces,1,4 tongue,5 soft palate,1,5 and supporting structures, such as the hyoid bone,1,4 mandible,1,4 and cervical vertebrae.1 Some of these structures may be difficult to identify; for example, the use of a radiopaque paste has been advocated to highlight the tongue contour.6 In addition, there are specific criticisms of the use of LCRs in upper airway assessment because images are obtained with subjects in an upright position, which obviously differs from the sleeping position. Nevertheless, it is acknowledged that LCRs can discriminate between obstructive sleep apnea and snoring independently from the position of the subject,7 confirming their potential relevance as a screening method. In fact, it is recommended that mouth-breathing children be sent for a sleep assessment if their superior pharyngeal airway space appears small on an LCR.4 Yet, although the intrinsic static nature of LCRs raises concerns about their reliability, this has received limited exploration: a few reports among adults8–10 and no previous study performed in children.
The present study aimed to determine the reliability of LCRs in the assessment of the upper airway in children in order to identify which, if any, variables were reliable for potential use in the clinical diagnosis and assessment of treatment outcomes in the management of sleep-disordered breathing.
MATERIALS AND METHODS
Subjects
In the study of Cooke,11 published in 1986, stratified sampling was performed on 11 randomly selected schools in Hong Kong; 618 children were recruited to receive an LCR. A subgroup of 57 children (28 male, 29 female; mean age = 12.6 years; SD = 0.5), without previous orthodontic treatment and receiving LCRs with the same protocol, was included in the present study. LCRs showing mouth opening, swallowing action, tongue not in rest position, evident changes in head posture, or tissue falling outside the frame were excluded (Figure 1).
The present study was approved by the Institutional Review Board of the University of Hong Kong / Hospital Authority Hong Kong West Cluster (UW12-405).
Acquisition of LCRs
LCRs were taken with an analog X-ray unit (GE1000, General Electric, Boston, Massachusetts, USA) with cephalometer (CI-2, Wehmer, Lombard, Illinois, USA). Subjects were in natural head posture (NHP),12 and ear posts were used to stabilize the position. One LCR was taken at baseline (T1), and a second LCR (T2) was taken either after 5–10 minutes (subgroup A) or 60–100 minutes (subgroup B). Only LCRs taken with the described protocol were selected from the original sample collected by Cooke.11
Variables and Tracing
Cephalometric analysis was carried out with a software program (CASSOS, Soft Enable Technology Limited, Hong Kong SAR), and linear measurements were corrected according to a magnification of 8.75% for midsagittal structures. Area measurements were made with graphical software (ImageJ13) and corrected according to a magnification of 18.27%. Figure 2 and Table 1 illustrate the points and lines used to identify the variables, and Figure 2 and Table 2 illustrate the variables.5,14,15
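The two magnification values are consistent with each other, since an area scales with the square of the linear magnification factor (1.0875 × 1.0875 ≈ 1.1827). The following Python sketch illustrates the arithmetic. It assumes that the correction consists of dividing the measured value by the magnification factor; the function names are illustrative and are not part of CASSOS or ImageJ.

```python
# Linear magnification for midsagittal structures, as stated in the text.
LINEAR_MAGNIFICATION = 0.0875  # 8.75%


def area_magnification(linear_magnification=LINEAR_MAGNIFICATION):
    """Area magnification implied by a linear one (area scales with the square)."""
    return (1.0 + linear_magnification) ** 2 - 1.0


def correct_length(measured_mm, linear_magnification=LINEAR_MAGNIFICATION):
    """Assumed correction: divide the measured length by the linear factor."""
    return measured_mm / (1.0 + linear_magnification)


def correct_area(measured_mm2, linear_magnification=LINEAR_MAGNIFICATION):
    """Assumed correction: divide the measured area by the squared linear factor."""
    return measured_mm2 / (1.0 + linear_magnification) ** 2


# The implied area magnification, in percent, rounded to two decimals.
print(round(area_magnification() * 100, 2))  # 18.27
```

Under this assumption, a length of 10.875 mm on the film corresponds to 10 mm in the patient.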
Measurements were obtained by a primary assessor (G.X., an orthodontist) and a secondary assessor (G.M., an orthodontist) after an initial calibration on 10 LCRs. The primary assessor conducted assessments at T1 and T2. After a washout period of about 1 month, both the primary assessor (T1′) and the secondary assessor (T1″) repeated 25 of the T1 measurements.
Four data sets were analyzed: one including the first LCR measured by the primary assessor (T1), one with the first LCR remeasured by the primary assessor (T1′), one with the first LCR measured by the secondary assessor (T1″), and one with the second LCR measured by the primary assessor (T2). In the assessments, T1-T2 represents the airway change, T1-T1′ the intra-assessor difference, and T1-T1″ the inter-assessor difference.
Sample-Size Calculation
The sample size was calculated allowing the intraclass correlation coefficient (ICC) to identify a significant agreement > 0.8, with a power of 80% and a significance level of 5% (two sided).16 The required sample was n = 49 and, given the retrospective nature of the study, all 57 children were included.
Data Analysis
The normality of the data distribution was verified with the Shapiro-Wilk test. Differences in the T1-T2 changes between subgroup A and subgroup B were compared with the Student's t-test for independent samples.
First, for comparison analysis, the mean directional difference (DD)17 was calculated (DD = X_T1 – X_T2). For normally distributed data, a one-sample Student's t-test was used to compare mean DD to zero. For non-normally distributed data, a one-sample Wilcoxon signed-rank test was used to compare median DD to zero. Significant differences in DD indicate systematic bias between groups, and effect size was estimated through the standardized directional difference (SDD),17 calculated by dividing DD by the standard deviation of the T1 measurements (SDD = DD / SD_T1). SDD was considered small if close to ±0.2, medium if close to ±0.5, and large if close to ±0.8 or above.18
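As an illustration of the directional difference and its standardized form, the following Python sketch computes the mean DD, the SDD, and the one-sample t statistic for a set of paired measurements. The values are hypothetical and are not data from this study. The use of the sample standard deviation (n − 1 denominator) is an assumption, and the P value, which in the study came from the statistical software, is not computed here.

```python
from math import sqrt
from statistics import mean, stdev


def directional_differences(t1, t2):
    """Signed differences X_T1 - X_T2, one per subject."""
    return [a - b for a, b in zip(t1, t2)]


def standardized_directional_difference(t1, t2):
    """Mean DD divided by the standard deviation of the T1 measurements."""
    return mean(directional_differences(t1, t2)) / stdev(t1)


def one_sample_t_statistic(differences):
    """t statistic for testing whether the mean difference equals zero."""
    n = len(differences)
    return mean(differences) / (stdev(differences) / sqrt(n))


# Hypothetical measurements in mm (illustration only).
t1 = [10.0, 12.0, 14.0, 16.0]
t2 = [9.0, 12.5, 13.0, 15.5]

dd = directional_differences(t1, t2)
mean_dd = mean(dd)
sdd = standardized_directional_difference(t1, t2)
t_stat = one_sample_t_statistic(dd)
print(f"mean DD = {mean_dd:.2f} mm, SDD = {sdd:.2f}, t = {t_stat:.2f}")
```

With these illustrative values the SDD is close to 0.2, which would be read as a small systematic bias under the benchmarks given above.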
For assessing differences not accounting for positive and negative signs, the absolute difference (AD) was calculated. Dahlberg's error was calculated,19 and the Bland-Altman method20 was used for graphical illustration of the agreements between T1 and T2 measurements.
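A minimal Python sketch of these error estimates is given below, using hypothetical values rather than data from this study. It assumes the commonly used form of Dahlberg's formula, the square root of the sum of squared differences divided by twice the number of pairs, and the conventional Bland-Altman limits of agreement at the mean difference ± 1.96 standard deviations. The function names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev


def mean_absolute_difference(t1, t2):
    """Mean of the unsigned differences between paired measurements."""
    return mean(abs(a - b) for a, b in zip(t1, t2))


def dahlberg_error(t1, t2):
    """Dahlberg's error: sqrt(sum(d^2) / (2n)) over n paired measurements."""
    d = [a - b for a, b in zip(t1, t2)]
    return sqrt(sum(x * x for x in d) / (2 * len(d)))


def bland_altman_limits(t1, t2, z=1.96):
    """Return (bias, lower limit, upper limit) of agreement."""
    d = [a - b for a, b in zip(t1, t2)]
    bias = mean(d)
    spread = z * stdev(d)
    return bias, bias - spread, bias + spread


# Hypothetical measurements in mm (illustration only).
t1 = [10.0, 12.0, 14.0, 16.0]
t2 = [9.0, 12.5, 13.0, 15.5]

ad = mean_absolute_difference(t1, t2)
error = dahlberg_error(t1, t2)
bias, lower, upper = bland_altman_limits(t1, t2)
print(f"AD = {ad:.2f} mm, Dahlberg = {error:.2f} mm")
print(f"bias = {bias:.2f} mm, limits of agreement = [{lower:.2f}, {upper:.2f}] mm")
```

In a Bland-Altman plot, each pair is drawn as its difference against its mean, and points falling outside the two limits are the ones that stand out as outliers.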
Finally, for agreement analysis, the single measure ICC for absolute agreement was employed.21 ICC was considered poor if < 0.5, fair from 0.5 to 0.7, good from 0.7 to 0.8, excellent if > 0.8, and perfect if = 1.0.21 The F-test was used to assess if ICC was > 0.8 (T1-T2).22
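For illustration, the single-measure ICC for absolute agreement can be obtained from the mean squares of a two-way analysis of variance (subjects by measurements). The Python sketch below uses hypothetical values and assumes the standard two-way formulation; the exact model selected in the statistical software is not stated in the text. The classification follows the thresholds given above, with values falling exactly on a boundary assigned by assumption.

```python
def icc_absolute_single(rows):
    """Single-measure, absolute-agreement ICC from a subjects x measurements table."""
    n = len(rows)       # number of subjects
    k = len(rows[0])    # number of measurements per subject
    grand = sum(sum(r) for r in rows) / (n * k)
    row_means = [sum(r) / k for r in rows]
    col_means = [sum(r[j] for r in rows) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for r in rows for x in r)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


def classify_icc(icc):
    """Label an ICC using the thresholds in the text (boundary handling assumed)."""
    if icc == 1.0:
        return "perfect"
    if icc > 0.8:
        return "excellent"
    if icc >= 0.7:
        return "good"
    if icc >= 0.5:
        return "fair"
    return "poor"


# Hypothetical paired measurements in mm: one row per subject, columns T1 and T2.
pairs = [[10.0, 9.0], [12.0, 12.5], [14.0, 13.0], [16.0, 15.5]]
icc = icc_absolute_single(pairs)
print(f"ICC = {icc:.3f} ({classify_icc(icc)})")
```

Because the absolute-agreement form includes the variance between the columns in its denominator, a systematic shift between T1 and T2 lowers the coefficient, which would not be the case for a consistency-type ICC or a Pearson correlation.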
Statistical analysis was performed with statistical software (SPSS Statistics 20, IBM, Armonk, New York, USA) at the significance level α = 0.05.
RESULTS
From the initial pool of 618 patients, 550 were excluded because LCRs were taken with different protocols, resulting in 68 patients. Of these, five were excluded because the tongue was not in the rest position, four patients were not occluding, one was swallowing, and one had the hyoid bone out of the X-ray film, resulting in 57 patients included in the analysis. No patient showed evident changes in the head posture.
The T1-T2 changes did not differ between subgroups A and B, meaning that taking LCRs at intervals of 5–10 minutes or 60–100 minutes was not relevant, and the data sets were merged.
Method Reliability
The comparison of the upper airway between the two consecutive LCRs, representing the method reliability (T1 - T2), is shown in Table 3. Only two measurements showed a DD statistically different from zero, and both differences were < 1.0 mm. The ICC ranged between 0.60 (fair) and 0.96 (excellent). Three linear parameters (AH-CV, AH-FH, and PM-UPW) showed an ICC significantly > 0.8. The upper airway area (UAA) was also worth mentioning: it approached significance (P = 0.059) and showed few outliers in the Bland-Altman plots (Figure 3).
Intra-assessor and Inter-assessor Reliability
The comparison of the two sets of measurements made by the same assessor on the same LCR, representing the intra-assessor reliability (T1 - T1′), is shown in Table 4. The ICC ranged between 0.86 and 1.00 (excellent).
The comparison of the two sets of measurements made by the two assessors on the same LCR, representing the inter-assessor reliability (T1 - T1″), is shown in Table 4. The ICC ranged between 0.66 (fair) and 0.99 (excellent).
Overall Reliability
Only the AD and the Dahlberg error of the method reliability and the inter-assessor reliability showed values that may have clinical relevance (Tables 3 and 4). Measurements of the upper airway and the hyoid bone had excellent method reliability (ICC from 0.82 to 0.96), intra-assessor reliability (ICC from 0.99 to 1.0), and inter-assessor reliability (ICC from 0.94 to 0.99). However, the assessment of the soft palate and the tongue area had somewhat lower method reliability (ICC from 0.60 to 0.86) and lower inter-assessor reliability (ICC from 0.66 to 0.96). The soft palate area and thickness were of particular concern (Figure 3).
DISCUSSION
Variations Due to the Assessors
The intra-assessor reliability was excellent, showing minimal differences and errors in all measurements. In fact, all measurements but one had a reliability close to perfect (Table 4). The only measurement with lower but still acceptable intra-assessor reliability was the soft palate area, whose anterior border is located where the soft palate contacts the tongue. As the two muscles have similar radiolucency and are usually in tight contact for sealing the oral cavity during nasal breathing, distinguishing their borders may be challenging. Accordingly, a small but increased systematic bias (SDD) was associated with measurement of the soft palate area and the tongue area, which were the two most critical measurements (Table 4). These results were in agreement with Malkoc et al.,10 who reported a high intra-assessor reliability in the linear measurements of the upper airway, hyoid bone, soft palate, and tongue. Similarly, Juliano et al.4 reported a perfect or substantial intra-assessor reliability in the linear measurements of the upper airway and hyoid bone. Additionally, Pirila-Parkkinen et al.1 showed excellent results for linear and angular measurements of the upper airway, hyoid bone, and soft palate. Thus, very little of the overall LCR reliability might be affected by intra-assessor variations.
Although the inter-assessor reliability of the upper airway and hyoid bone measurements was excellent, it decreased to good for the tongue area and to fair for the soft palate thickness and area. Accordingly, the AD in the tongue area was relevant, and a “medium” to “large” systematic bias (SDD) was present in the tongue area and soft palate thickness, respectively. The study findings confirmed that the area of contact between the two structures was critical but that the airway patency, which is the area of greater clinical relevance, was not affected.
Factors Affecting the Morphology of the Analyzed Structures
In adults, changes in the NHP and cranio-cervical inclination affect the hyoid bone position23 and upper airway morphology.23,24 Thus, although using the NHP allows good reliability in cephalometric analysis in children,25 minor variations are not controllable and may have influenced the measurements (Figure 4). In addition, although patients with mouth opening were excluded (Figure 1) in the present study, slight changes in the mandibular posture did not lead to exclusion and may have affected the position of the hyoid bone.9
Tongue posture and swallowing primarily affect upper airway morphology. During swallowing, a chain of muscular activities is triggered26 and has a generalized effect on the airway. For this reason, patients swallowing or showing evidence of tongue movement were excluded (Figure 1). However, some variation in the tongue position cannot be prevented and inevitably affected the results (Figure 4).
Overall, in order to control these confounding variables, LCRs used to assess the upper airway should be taken with the patient in NHP and natural neck posture, with light dental contacts in centric occlusion, during normal inspiration, without swallowing, and with the tongue in the rest position.
Overall Reproducibility of the Analyzed Structures
In general, given the excellent intra-assessor agreement, the variations between the two LCRs reported in this study should be mainly attributable to real morphologic changes. Furthermore, since no differences were found in LCRs taken at a 5- to 10-minute interval or at a 60- to 100-minute interval, the time interval may not affect their reliability, in agreement with a previous study in adults.10
The upper airway and hyoid bone measurements showed excellent reliability on the two consecutive LCRs (Figure 3). In addition, the DD, although significant in the case of the retropalatal oropharyngeal airway space (P = .004), was never clinically relevant, and the systematic bias was low or medium (Table 3).
In particular, the variables whose reliability was significantly higher than the threshold for excellent (ICC > 0.8) were the nasopharyngeal airway space, horizontal position of the hyoid bone with respect to the vertebrae, and vertical position of the hyoid bone with respect to the Frankfort plane (Figure 3). These findings were in disagreement with Stepovich,8 who found the reliability of the position of the hyoid bone to be questionable. However, besides the very limited sample size of this previous study, the examinations were made when subjects were seated.8 In fact, Malkoc et al.,10 by using the NHP in standing patients, found good reproducibility of the hyoid bone position.
Conversely, the tongue measurements showed lower reliability, ranging from good to excellent, and the soft palate showed even poorer results (Figure 3). Accordingly, higher absolute differences and errors were present, up to 3.1 mm and 92 mm², confirming these two movable structures to be the most critical. Although radiopaque pastes can enhance the visibility of the tongue borders,6 the effect is limited to the tongue dorsum, and it might be necessary to adopt different imaging techniques, such as magnetic resonance imaging,27 for proper assessment of these structures.
That said, the poor reliability in the soft palate measurement was in contrast with a study by Malkoc et al.10 However, this former study was performed in adults, and hyoid movements9 and oropharyngeal reflexes26 are different in children. Furthermore, although the Pearson correlation coefficient used by Malkoc et al.10 is appropriate for detecting linear associations, it does not properly represent the reliability.
Limitations
Reliability is fundamental in order to use LCRs for diagnosis. If the LCR produces the same results multiple times and independently from the assessor, it can be considered reliable. However, reliability is independent from validity, which expresses, for example, whether the obstruction measured on an LCR is representative of the real obstruction of the patient. The present study aimed at investigating the reliability of LCRs and did not consider their validity for the diagnosis of upper airway obstruction. In particular, the diagnosis and treatment of upper airway disorders in children require a comprehensive medical approach,2 and it is important to critically analyze the potential role of LCRs in such contexts.
CONCLUSIONS
Measurements of the upper airway morphology, considered as the intramural space, and of the hyoid bone position were highly reliable on LCRs of children.
The limited reliability in the assessment of tongue and soft palate may compromise the diagnostic application of these parameters on LCRs, and different imaging methods might be advisable.
The reliability of LCRs taken in the NHP, under gentle occlusion, and by instructing the patient to refrain from swallowing, may be affected by such factors as minor variations in head posture, neck posture, and tongue movements. It is important to minimize these variables in order to improve the clinical relevance of LCRs.
ACKNOWLEDGMENTS
The authors declare no conflict of interest. We are grateful to Ms Samantha K. Y. Li for her assistance with the statistical analysis. We also thank Dr Michael S. Cooke for the collection of lateral cephalograms during the preparation of his PhD thesis, which made possible the present retrospective study.
REFERENCES
Author notes
Postgraduate student, Department of Orthodontics, Dental School, University of Brescia, Brescia, Italy.
Research Assistant, Orthodontics, Faculty of Dentistry, The University of Hong Kong, Hong Kong SAR, and Attending Physician, Department of Orthodontics, Shenyang Stomatology Hospital, Shenyang, People's Republic of China.
Clinical Professor, Dental Public Health, Faculty of Dentistry, The University of Hong Kong, Prince Philip Dental Hospital, Hong Kong SAR.
Associate Professor, Orthodontics, Faculty of Dentistry, The University of Hong Kong, Prince Philip Dental Hospital, Hong Kong SAR.
Honorary Research Associate, Orthodontics, Faculty of Dentistry, The University of Hong Kong, Prince Philip Dental Hospital, Hong Kong SAR.
Assistant Professor, Dental Materials Science, Discipline of Applied Oral Sciences, Faculty of Dentistry, The University of Hong Kong, Prince Philip Dental Hospital, Hong Kong SAR.
Clinical Assistant Professor, Orthodontics, Faculty of Dentistry, The University of Hong Kong, Prince Philip Dental Hospital, Hong Kong SAR.