Abstract
Objective: To evaluate the reliability of a questionnaire that assessed the expectations and experiences of adolescent patients about orthodontic treatment.
Materials and Methods: The study included two groups of patients: 30 consecutive patients (19 girls and 11 boys, mean age 14.6 years, SD 2.3 years) naïve to orthodontic treatment, and 30 consecutive adolescent patients (17 girls and 13 boys, mean age 15.1 years, SD 2.0 years) in active orthodontic treatment with fixed appliances in both jaws. A questionnaire comprising 46 items was developed, based upon focus group interviews and previous established questionnaires. The questionnaire covered the following domains: Treatment motivation; treatment expectations; pain and discomfort from teeth, jaws, and face; functional jaw impairment; and questionnaire validity. Internal consistency as well as temporal stability with the test-retest method was investigated.
Results: A majority of the questions exhibited acceptable test-retest reliability, and composite scores yielded excellent reliability for all domains. Internal consistency was acceptable and good face validity was found for all domains.
Conclusion: The questionnaire can be recommended for use in the assessment of expectations and experiences of orthodontic treatment.
INTRODUCTION
For orthodontic treatment to be successful, treatment methods must be effective, require minimal compliance, and cause minimal pain and discomfort. Current orthodontic techniques must therefore continuously be refined and new techniques developed and systematically evaluated. Besides analyzing the effectiveness of a new treatment method, it is also necessary to investigate how well patients accept the method and whether they experience any side effects. Patients' experiences of pain and functional impairment during treatment are commonly assessed with self-administered questionnaires that incorporate scales such as the visual analogue scale (VAS) and the verbal rating scale (VRS).
Previous research on patients' experiences during orthodontic treatment has observed that pain and discomfort are reported mainly in the first week after insertion of an orthodontic appliance.1,2 However, other studies have reported pain periodically throughout orthodontic treatment.3,4 The degree of pain and discomfort can be explained not only by force application and different types of appliances but also by emotional, cognitive, and environmental factors, including culture, gender, and age.5–8 It has been shown that previous memories of pain or fear of pain aggravate the experience of discomfort related to orthodontic treatment, whereas patients with a high personal perception of the severity of their malocclusion exhibit high compliance and low pain and discomfort.9
For generally applicable conclusions to be drawn, the reliability and validity of clinical and subjective measurements must first be determined.
Few studies have evaluated the reliability and validity of questionnaires in a young population receiving regular orthodontic treatment,10,11 and therefore it is important to analyze whether questionnaires are adequate, well understood, and easy to complete by this patient group. For this purpose, qualitative methods can be complementary and useful tools when orthodontic treatment is explored from a patient's perspective through a questionnaire. Focus group interviews are an example of a qualitative method that has been predominantly used in sociological research but recently also in medicine and dentistry.12,13
It was hypothesized that a questionnaire whose design was largely based on focus group interviews was reliable and valid. Thus, the aim of the study was to evaluate the reliability and validity of a questionnaire that assessed the expectations and experiences of orthodontic treatment in adolescents.
MATERIALS AND METHODS
Subjects
The study included two groups of patients: 30 consecutive patients (19 girls and 11 boys, mean age 14.6 years, SD 2.3) who were to enter orthodontic treatment and 30 consecutive adolescent patients (17 girls and 13 boys, mean age 15.1 years, SD 2.0) in active orthodontic treatment with fixed appliances at the Orthodontic Clinic in Gävle, Sweden. The ethics committee of Uppsala University, Sweden, approved the protocol and the informed consent form, according to the guidelines of the Declaration of Helsinki. The patients and their parents signed an informed written consent.
Design
The investigation consisted of a self-reported questionnaire, divided into five separate domains, concerning the motivation of adolescent patients to undergo orthodontic treatment and their expectations and experiences of orthodontic treatment. The questionnaires were completed twice at a 1- to 2-week interval, and two investigators were available to explain the questions and to check the questionnaires for completeness and legibility. About 10–15 minutes were needed to complete the questionnaire.
One form of reliability of a questionnaire is temporal stability, and the most common approach is to administer the questionnaire on two occasions separated by a time interval chosen so that the measured circumstances remain stable. This approach is called test-retest reliability.14 Reliability of a questionnaire can be assessed for single questions or for summary scores for the complete questionnaire or its separate domains. When summary scores are measured, it is important to consider a second aspect of reliability, namely internal consistency.15 This characterizes the homogeneity of the questionnaire items, that is, the extent to which they measure one underlying construct, and expresses how well the separate questions within each part relate to each other. Because this questionnaire was divided into separate domains, internal consistency was also evaluated.
Reliability based on summary scores for each domain was also tested separately for girls and boys and for patients yet to undergo treatment and those already in treatment.
Face validity was established by asking the patients whether the items in the questionnaire were relevant and reflected their motivation for orthodontic treatment and their expectations and experience of orthodontic treatment.
Questionnaire
To create relevant questions, a qualitative method for assessing patients' opinions about orthodontic treatment was initiated. Interviews from three focus groups with orthodontic patients who had recently completed active orthodontic treatment and one group with parents of adolescent patients in retention resulted in 4 hours of audiotaped information. The interviews were conducted by two investigators using an open-ended interview style. The participants were asked to describe why they had sought orthodontic treatment and how they had experienced the treatment process. Transcripts were made from the audiotapes and the results were analyzed and used as a basis when the questionnaires were constructed. Thus, the final questionnaire comprised 46 questions influenced by the focus group interviews and partly by other established questionnaires.16–18
The questionnaire covered the following domains (Table 1): treatment motivation; treatment expectations; pain and discomfort from teeth, jaws, and face; functional jaw impairment; and questionnaire validity.
Treatment motivation
This domain contained seven questions assessed on a VAS with the end phrases “not at all” and “very much” or “not at all” and “completely” (Table 2).
Treatment expectations
This domain contained four questions assessed on a VAS. All questions had the end phrases “not at all” and “very much” and are listed in Table 3.
Pain and discomfort from teeth, jaws, and face
This domain contained 13 questions: 10 questions on a VAS with the end phrases “none at all” and “worst imaginable,” and one question on a two-point scale (yes or no) with two follow-up questions on a four-point scale (Table 4).
Functional jaw impairment
This domain, a freestanding questionnaire, has previously been used in other populations but not for regular orthodontic patients,18 and contains 18 questions. Eight were related to mandibular function, three to psychosocial activities, and seven to eating specific foods. Each question was assessed on a four-point scale with the alternatives “not at all,” “slightly,” “much,” or “extremely” difficult (Table 5).
Questionnaire validity
This domain contained four questions, one for each domain, assessed on a VAS with the end phrases “not at all” and “very well”. For details see Table 6.
All patients in the group that had yet to enter orthodontic treatment assessed all questions, and the 30 patients already in active treatment assessed those parts of the questionnaire pertaining to treatment (questions 12–42, 45, and 46).
Statistical Analysis
Test-retest reliability
When the questions were evaluated on a continuous scale or when summary scores for questionnaire domains were measured, reliability was assessed by calculating the intraclass correlation coefficient (ICC) based on a two-way mixed analysis of variance. This is an estimate of the precision in the data obtained by multiple measurements, relating the amount of measurement error to the subject variability. An ICC above .75 indicates excellent reliability, an ICC between .4 and .75 indicates fair to good reliability, and an ICC below .4 indicates poor reliability.19
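As an illustration of the calculation described above, the following is a minimal sketch in Python of a single-measure, consistency-type ICC from a two-way analysis of variance (often written ICC(3,1)). The text states only that a two-way mixed model was used; the choice of the consistency form, the function names icc_two_way_mixed and interpret_icc, and the example scores are assumptions made for illustration and are not data from this study.

```python
def icc_two_way_mixed(scores):
    """Single-measure, consistency-type ICC from a two-way ANOVA.

    scores: one row per subject, one column per assessment occasion,
    e.g. [[test, retest], ...]. Needs at least 2 subjects and 2 occasions.
    """
    n = len(scores)       # number of subjects
    k = len(scores[0])    # number of occasions (2 for test-retest)
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Partition the total sum of squares into subjects, occasions, residual.
    ss_subjects = k * sum((m - grand) ** 2 for m in row_means)
    ss_occasions = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_subjects - ss_occasions

    ms_subjects = ss_subjects / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_subjects - ms_error) / (ms_subjects + (k - 1) * ms_error)


def interpret_icc(icc):
    """Apply the cut-offs quoted in the text (.75 and .4)."""
    if icc > 0.75:
        return "excellent"
    if icc >= 0.4:
        return "fair to good"
    return "poor"


# Hypothetical VAS scores (0-100) for five subjects at test and retest.
vas = [[12, 15], [40, 38], [55, 60], [70, 66], [85, 90]]
icc = icc_two_way_mixed(vas)
print(round(icc, 2), interpret_icc(icc))
```

Because the consistency form removes a systematic shift between occasions, a constant offset at retest does not lower this coefficient; an absolute-agreement form would penalize it. Which form the statistical software applied is not stated in the text.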
The kappa statistic (Cohen's kappa, κ) was computed to assess reliability when the questionnaire variable was measured on an ordinal or dichotomous scale. Kappa values above .80 were considered excellent, .61–.80 good, .41–.60 moderate, .21–.40 fair, and .20 and below poor.20 Kappa adjusts for the likelihood of agreement by chance. Chance agreement is high when patients can be expected to be free from symptoms. Percentage of total agreement was therefore computed for the questions measured on an ordinal or dichotomous scale.
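The relation between kappa and percentage agreement can be sketched as follows. This is a minimal, unweighted Cohen's kappa in Python; the function names, the handling of values that fall between the quoted bands, and the response data are assumptions for illustration only. The data are constructed, not taken from the study, so that nearly all respondents report no difficulty on both occasions.

```python
from collections import Counter


def cohens_kappa(first, second):
    """Return (kappa, observed agreement) for two paired rating lists."""
    n = len(first)
    observed = sum(a == b for a, b in zip(first, second)) / n
    count1, count2 = Counter(first), Counter(second)
    # Chance agreement: product of the marginal proportions per category.
    expected = sum(count1[c] * count2[c] for c in count1) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected), observed


def interpret_kappa(kappa):
    """Apply the bands quoted in the text; band edges are an assumption."""
    if kappa > 0.80:
        return "excellent"
    if kappa > 0.60:
        return "good"
    if kappa > 0.40:
        return "moderate"
    if kappa > 0.20:
        return "fair"
    return "poor"


# Constructed example, 60 respondents, 0 = "not at all", 1 = "slightly":
# 55 answer 0 twice, 1 answers 1 twice, 4 change their answer.
test = [0] * 55 + [1] + [0, 0] + [1, 1]
retest = [0] * 55 + [1] + [1, 1] + [0, 0]
kappa, agreement = cohens_kappa(test, retest)
print(round(kappa, 2), round(agreement, 2), interpret_kappa(kappa))
```

In this constructed case, 56 of 60 answers agree (about 93%), yet chance agreement is already about 90% because almost everyone is symptom-free, so kappa is only about .30. This is why percentage agreement is a useful companion to kappa for such questions.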
Internal consistency
Cronbach's alpha (α) was calculated in order to estimate how consistently the subjects responded to the separate questions within each domain. Alpha values of .70 or higher were considered to be sufficient.21
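A minimal sketch of the standard Cronbach's alpha formula in Python is given below. The function name cronbach_alpha and the example scores are hypothetical and are not taken from the study; sample variances are used, which is a common convention but is not stated in the text.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for one domain.

    responses: one row per subject, one column per item in the domain.
    Needs at least 2 subjects and 2 items.
    """
    k = len(responses[0])  # number of items in the domain
    # Variance of each item across subjects.
    item_variances = [variance([row[j] for row in responses]) for j in range(k)]
    # Variance of the summed (composite) score across subjects.
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical VAS scores for four items answered by five subjects.
domain = [
    [80, 75, 90, 70],
    [40, 50, 45, 35],
    [60, 65, 55, 70],
    [20, 30, 25, 15],
    [90, 85, 95, 80],
]
alpha = cronbach_alpha(domain)
print(round(alpha, 2), "sufficient" if alpha >= 0.70 else "below .70")
```

Alpha rises when items vary together, so that the variance of the composite score is large relative to the sum of the item variances, and it also tends to rise with the number of items.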
RESULTS
All 60 patients filled in the questionnaire twice, at an average interval of 12 days; there were no dropouts.
Test-Retest Reliability
The reliability of the questionnaire, based on summary scores from each subject, was excellent (ICC = .84–.92) for all five domains (Table 7). There were only small differences in domain reliability between the group of patients yet to enter treatment and those already in treatment. However, a discrepancy for the domain “treatment motivation” was observed between girls and boys. The calculated reliability was good for the girls (ICC = .68) and excellent for the boys (ICC = .88).
Tables 2 through 6 present the reliability of the separate questions within each domain. ICC ranged between .21 and .88 and kappa ranged between .30 and .93. Overall, good to excellent reliability was found. However, questions 6, 18, and 19 showed poor reliability and questions 8 and 15 showed fair reliability.
In the domain “functional jaw impairment” questions 31 and 34 exhibited fair reliability (κ = .30 and .40) and questions 27, 29, and 35 exhibited moderate reliability (κ = .58, .52, and .48). These questions were, however, considered acceptable because percentage of total agreement was comparable with the other questions in this domain (Table 5).
Internal Consistency
Internal consistencies for the separate domains were α = .67–.87 at the first assessment and .63–.94 at the second (Table 7), which implies that internal consistency was sufficient for all five domains. The difference between the two assessments illustrated the sampling variability.
Face Validity
The fifth domain contained four questions, one for each questionnaire domain, wherein the patients were asked whether they considered the questions to be relevant. Very high scores (80–94) were obtained at the VAS assessment (0–100) for face validity. See Table 6.
DISCUSSION
The reliability and validity of a questionnaire are the decisive factors for evaluating its precision and the criteria for drawing generalized conclusions. We have here investigated two types of reliability, temporal stability and internal consistency. The most important findings were that a new questionnaire concerning motivation, expectations, and experiences of orthodontic treatment in adolescents had good to excellent reliability with the test-retest method and that the questions within each questionnaire domain had acceptable consistency. Good face validity was ensured by involving patients in the retention phase (focus groups) in the development of the new questionnaire and by asking the patients whether they considered the questions to be relevant. The stated hypothesis, that a questionnaire designed largely from focus group interviews would be reliable and valid, was thus confirmed. This means that adequate and applicable questions, easily understood by adolescents, could be constructed with the help of focus group interviews. Furthermore, the gender and age distribution in the study was similar to that in other studies of adolescents undergoing orthodontic treatment,9,22,23 and the results were therefore considered to be representative for these individuals.
Two types of assessment scales were used: the VRS and the VAS. Both are common methods for assessing pain and functional impairment in children and are considered to be reliable and valid methods.24 In this questionnaire, both separate questions and composite scores for each domain were evaluated. It was therefore important that acceptable and sufficient consistency be ensured within each domain. Cronbach's alpha was high for the domains “functional jaw impairment” and “questionnaire validity” (α = .84–.94) and lower, but acceptable, for the domains “treatment motivation,” “treatment expectations,” and “pain and discomfort from teeth, jaws, and face” (α = .63–.85). An increased number of items within these three domains would probably have improved consistency and homogeneity, but because it was important that the patients be able to assess the questionnaire relatively quickly, the number of items was restricted.
Test-retest reliability based on summary scores was excellent for all five questionnaire domains in this study (ICC = .84–.92). The domains “treatment motivation” and “treatment expectations” were assessed only by the 30 subjects yet to undergo orthodontic treatment. A probable cause for the difference in reliability for the domain “treatment motivation” between boys (excellent) and girls (good) could therefore be the small sample size.
The reliability of the domain “functional jaw impairment” was excellent (ICC = .92), which is in agreement with Stegenga,18 who used the scale with patients with temporomandibular disorders. The reliability found by Marcusson,25 however, who used the scale on adult cleft lip and palate patients, was lower (ICC = .67). To our knowledge, this scale has not been used on ordinary orthodontic patients before.
It is important to bear in mind that when questionnaire reliability is based on composite scores, one loses the opportunity to analyze details in individual questions; therefore, reliability was also tested on all individual questions. The test-retest reliability of the individual questions was acceptable overall. High reliability is, however, difficult to achieve in homogeneous populations because reliability is a measure of how well the variable can distinguish between subjects. Because the subjects in our study formed a very homogeneous group of healthy adolescents with no or few symptoms, this phenomenon was illustrated in a few individual questions.
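The dependence of reliability on between-subject variability can be made explicit with the usual variance-ratio reading of the ICC. The symbols below are introduced here for illustration and do not appear in the original text; the expression is a conceptual approximation, not the exact estimator used in the analysis.

```latex
\mathrm{ICC} \approx
\frac{\sigma^{2}_{\text{subjects}}}
     {\sigma^{2}_{\text{subjects}} + \sigma^{2}_{\text{error}}}
```

With the measurement error held fixed, the ratio shrinks toward zero as the between-subject variance shrinks, so a question answered almost identically by nearly all subjects can show a low ICC even when individual answers change very little between occasions.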
To increase the range of the two domains on potential inconveniences (“pain and discomfort” and “functional jaw impairment”), it was essential that the questions be assessed both by patients who had not yet started treatment and by patients in active treatment (60 patients altogether). However, because the test-retest estimation had to be performed under similar and stable circumstances, the subjects in active treatment were assessed during the last two weeks before an appointment, usually a time interval with few symptoms of pain and discomfort, and the study group was therefore still relatively homogeneous.
Three individual questions (6, 18, and 19) in this study had poor reliability, and four questions (8, 15, 31, and 34) had fair reliability. The question “Have you been properly informed about the orthodontic treatment?” had an ICC of .27 (Figure 1). It is known from other studies26 that pretreatment information is an important factor for future compliance and for pain and discomfort experiences, but because our subjects systematically scored lower at the second assessment, the reliability of this question is poor and the question will not be used in further studies. However, the poor reliability was probably an effect of an incorrect assumption that the circumstances between the assessments were stable.
It can also be stressed that the two other questions with poor reliability (18 and 19), “Do you have pain from your molars when they are in contact?” (ICC = .39) and “Do you have pain from your molars when they are not in contact?” (ICC = .21) demonstrated the problem with homogeneous data sets. These two questions could also easily be mixed up or difficult to understand, especially for patients with no previous orthodontic experience. In Figure 2, one outlier (21.5; 43) in a population with little variability decreased the ICC value from .69 to .39.
Moreover, questions 31 and 34, “If you have pain and discomfort from your teeth and jaws, how much does that affect drinking?” and “If you have pain and discomfort from your teeth and jaws, how much does that affect yawning?” had kappa values of .30 and .40, that is, fair reliability. Percentage agreements for the repeated assessments were, however, 93% and 92%, which indicates that these questions are acceptable and the discrepancy with the magnitude of the kappa statistics occurred because most subjects did not experience any difficulties (Table 8).
To ensure the legitimacy of the questionnaire, a fifth domain was added, “questionnaire validity,” which contained four questions about whether the items in the respective domains of the questionnaire reflected the subjects' opinions regarding expectations and experience of orthodontic treatment. These four questions exhibited high median values (average 89) on the VAS, which confirms that the questions were applicable and relevant.
This questionnaire was developed for a detailed scientific study of patients' experience of a new orthodontic technique, from the decision for treatment to satisfaction with the outcome. It was therefore essential to establish that the questions were reliable and valid. The focus group interviews explored different aspects of treatment experiences, and to ensure that the questions asked were valid, all these aspects had to be considered. For everyday clinical use, this questionnaire is somewhat extensive, but shortening the questionnaire by selecting a few questions is not advisable because consistency and validity can then no longer be guaranteed. However, because all questionnaire domains had excellent reliability and acceptable consistency, the domains could easily be used separately as “short versions.” For example, questions 1–5 and 7–11 could be used before treatment in order to establish patients' motivation and interest. Applicable questions from the domain “pain and discomfort from the teeth, jaws, and face” could be used during orthodontic treatment to study appliance acceptance, and the fourth domain, “functional jaw impairment,” could be used to study long-term effects during orthodontic treatment.
CONCLUSIONS
A vast majority of the questions in each domain exhibited acceptable test-retest reliability, and composite scores yielded good to excellent reliability for all domains. Internal consistency within each questionnaire domain was acceptable. Good face validity was found for the domains.
The questionnaire, which was largely designed from focus group interviews, can be recommended for use in the assessment of orthodontic treatment.
Acknowledgments
We wish to express our sincere thanks to Professor Arne Halling for having inspired us to use focus group interviews. This study was supported with grants by The Centre for Research and Development, Uppsala University/County Council of Gävleborg, Sweden, and the Swedish Dental Society.
REFERENCES
Author notes
Corresponding author: Dr Ingalill Feldmann, Orthodontic Clinic, Box 57, SE-801 02 Gävle, Sweden ([email protected])