Objective: To evaluate the reliability of a questionnaire that assessed the expectations and experiences of adolescent patients about orthodontic treatment.

Materials and Methods: The study included two groups of patients: 30 consecutive patients (19 girls and 11 boys, mean age 14.6 years, SD 2.3 years) naïve to orthodontic treatment, and 30 consecutive adolescent patients (17 girls and 13 boys, mean age 15.1 years, SD 2.0 years) in active orthodontic treatment with fixed appliances in both jaws. A questionnaire comprising 46 items was developed, based upon focus group interviews and previous established questionnaires. The questionnaire covered the following domains: Treatment motivation; treatment expectations; pain and discomfort from teeth, jaws, and face; functional jaw impairment; and questionnaire validity. Internal consistency as well as temporal stability with the test-retest method was investigated.

Results: A majority of the questions exhibited acceptable test-retest reliability, and composite scores yielded excellent reliability for all domains. Internal consistency was acceptable and good face validity was found for all domains.

Conclusion: The questionnaire can be recommended for use in the assessment of expectations and experiences of orthodontic treatment.

For orthodontic treatment to be successful, treatment methods must be effective, require minimal compliance, and cause minimal pain and discomfort. Current orthodontic techniques must therefore be continuously refined, and new techniques must be developed and systematically evaluated. Besides analyzing the effectiveness of a new treatment method, it is also necessary to investigate how well patients accept the method and whether they experience any side effects. Patients' experiences of pain and functional impairment during treatment are commonly assessed with self-administered questionnaires that incorporate scales such as the visual analogue scale (VAS) and the verbal rating scale (VRS).

Previous research on patients' experiences during orthodontic treatment has observed that pain and discomfort are reported mainly in the first week after insertion of an orthodontic appliance.1,2 However, other studies have reported pain periodically throughout orthodontic treatment.3,4 The degree of pain and discomfort can be explained not only by force application and different types of appliances but also by emotional, cognitive, and environmental factors, including culture, gender, and age.5–8 It has been shown that previous memories of pain or fear of pain aggravate the experience of discomfort related to orthodontic treatment, whereas patients with a high personal perception of the severity of their malocclusion exhibit high compliance and low pain and discomfort.9 

For generally applicable conclusions to be drawn, the reliability and validity of clinical and subjective measurements must first be determined.

Few studies have evaluated the reliability and validity of questionnaires in a young population receiving regular orthodontic treatment,10,11 and it is therefore important to analyze whether questionnaires are adequate, well understood, and easy for this patient group to complete. For this purpose, qualitative methods can be useful complements when orthodontic treatment is explored from a patient's perspective through a questionnaire. Focus group interviews are an example of a qualitative method that has been used predominantly in sociological research but recently also in medicine and dentistry.12,13

It was hypothesized that a questionnaire whose design was largely based on focus group interviews would be reliable and valid. Thus, the aim of the study was to evaluate the reliability and validity of a questionnaire that assessed the expectations and experiences of orthodontic treatment in adolescents.

Subjects

The study included two groups of patients: 30 consecutive patients (19 girls and 11 boys, mean age 14.6 years, SD 2.3) who were to enter orthodontic treatment and 30 consecutive adolescent patients (17 girls and 13 boys, mean age 15.1 years, SD 2.0) in active orthodontic treatment with fixed appliances at the Orthodontic Clinic in Gävle, Sweden. The ethics committee of Uppsala University, Sweden, approved the protocol and the informed consent form, according to the guidelines of the Declaration of Helsinki. The patients and their parents signed an informed written consent.

Design

The investigation consisted of a self-reported questionnaire, divided into five separate domains, concerning the motivation of adolescent patients to undergo orthodontic treatment and their expectations and experiences of orthodontic treatment. The questionnaires were completed twice at a 1- to 2-week interval, and two investigators were available to explain the questions and to check the questionnaires for completeness and legibility. About 10–15 minutes were needed to complete the questionnaire.

One form of questionnaire reliability is temporal stability. The most common way to characterize it is to administer the questionnaire on two separate occasions, separated by an adequate time interval chosen so that the measured circumstances remain stable. This approach is called test-retest reliability.14 Reliability of a questionnaire can be assessed for single questions or for summary scores for the complete questionnaire or its separate domains. When summary scores are measured, it is important to consider a second aspect of reliability, namely internal consistency.15 Internal consistency characterizes the homogeneity of the questionnaire items, that is, whether they measure one underlying construct, and expresses how well the separate questions within each part relate to one another. Because this questionnaire was divided into separate domains, internal consistency was also evaluated.

Reliability based on summary scores for each domain was also tested separately for girls and boys and for patients yet to undergo treatment and those already in treatment.

Face validity was established by asking the patients whether the items in the questionnaire were relevant and reflected their motivation for orthodontic treatment and their expectations and experience of orthodontic treatment.

Questionnaire

To create relevant questions, a qualitative method for assessing patients' opinions about orthodontic treatment was initiated. Interviews from three focus groups with orthodontic patients who had recently completed active orthodontic treatment and one group with parents of adolescent patients in retention resulted in 4 hours of audiotaped information. The interviews were conducted by two investigators using an open-ended interview style. The participants were asked to describe why they had sought orthodontic treatment and how they had experienced the treatment process. Transcripts were made from the audiotapes and the results were analyzed and used as a basis when the questionnaires were constructed. Thus, the final questionnaire comprised 46 questions influenced by the focus group interviews and partly by other established questionnaires.16–18 

The questionnaire covered the following domains (Table 1): treatment motivation; treatment expectations; pain and discomfort from teeth, jaws, and face; functional jaw impairment; and questionnaire validity.

Table 1.

Domains in the Questionnaire

Treatment motivation

This domain contained seven questions assessed on a VAS with the end phrases “not at all” and “very much” or “not at all” and “completely” (Table 2).

Table 2.

Reliability of the Domain “Treatment Motivation”

Treatment expectations

This domain contained four questions assessed on a VAS. All questions had the end phrases “not at all” and “very much” and are listed in Table 3.

Table 3.

Reliability of the Domain “Treatment Expectations”

Pain and discomfort from teeth, jaws, and face

This domain contained 13 questions: 10 questions on a VAS with the end phrases “none at all” and “worst imaginable,” and one question on a two-point scale (yes or no) with two follow-up questions on a four-point scale (Table 4).

Table 4.

Reliability of the Domain “Pain and Discomfort From the Teeth, Jaws and Face”

Functional jaw impairment

This domain, a freestanding questionnaire that has previously been used in other populations but not for regular orthodontic patients,18 contained 18 questions. Eight were related to mandibular function, three to psychosocial activities, and seven to eating specific foods. Each question was assessed on a four-point scale with the alternatives “not at all,” “slightly,” “much,” or “extremely” difficult (Table 5).

Table 5.

Reliability of the Domain “Functional Jaw Impairment”

Questionnaire validity

This domain contained four questions, one for each of the other four domains, assessed on a VAS with the end phrases “not at all” and “very well” (Table 6).

Table 6.

Reliability of the Domain “Questionnaire Validity”

All patients in the group that had yet to enter orthodontic treatment assessed all questions, and the 30 patients already in active treatment assessed those parts of the questionnaire pertaining to treatment (questions 12–42, 45, and 46).

Statistical Analysis

Test-retest reliability

When the questions were evaluated on a continuous scale or when summary scores for questionnaire domains were measured, reliability was assessed by calculating the intraclass correlation coefficient (ICC) based on a two-way mixed analysis of variance. This is an estimate of the precision in the data obtained by multiple measurements, relating the amount of measurement error to the subject variability. An ICC above .75 indicates excellent reliability, an ICC between .4 and .75 indicates fair to good reliability, and an ICC below .4 indicates poor reliability.19 
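The ICC calculation described above can be sketched as follows. This is a minimal illustration, not the authors' analysis script: the text specifies only a two-way mixed analysis of variance, so the choice of the consistency, single-measurement form, often written ICC(3,1), is an assumption, as are the function names and the example VAS scores.

```python
import numpy as np

def icc_consistency(scores):
    """ICC(3,1): two-way mixed model, consistency, single measurement.

    scores: array-like of shape (n_subjects, k_occasions).
    The ICC form is an assumption; the source does not state it.
    """
    x = np.asarray(scores, dtype=float)
    n, k = x.shape
    grand = x.mean()
    # Sums of squares from a two-way ANOVA without replication
    ss_subjects = k * ((x.mean(axis=1) - grand) ** 2).sum()
    ss_occasions = n * ((x.mean(axis=0) - grand) ** 2).sum()
    ss_total = ((x - grand) ** 2).sum()
    ss_error = ss_total - ss_subjects - ss_occasions
    ms_subjects = ss_subjects / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_subjects - ms_error) / (ms_subjects + (k - 1) * ms_error)

def classify_icc(icc):
    """Labels following the cut-offs quoted in the text (.4 and .75)."""
    if icc > 0.75:
        return "excellent"
    if icc >= 0.4:
        return "fair to good"
    return "poor"

# Hypothetical VAS scores (0-100) for five patients at test and retest
vas = [[10, 12], [35, 30], [50, 55], [70, 68], [90, 85]]
icc = icc_consistency(vas)
print(round(icc, 3), classify_icc(icc))
```

Because the error term excludes the systematic difference between occasions, this form measures consistency rather than absolute agreement; an absolute-agreement ICC would penalize a uniform shift between the two assessments.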

The kappa statistic (Cohen's kappa, κ) was computed to assess reliability when the questionnaire variable was measured on an ordinal or dichotomous scale. Kappa values above .80 were considered excellent, .61–.80 good, .41–.60 moderate, .21–.40 fair, and .20 and below poor.20 Kappa adjusts for the agreement expected by chance. Chance agreement is high when most patients can be expected to be free from symptoms, which can depress kappa even when raw agreement is high. Percentage of total agreement was therefore also computed for the questions measured on an ordinal or dichotomous scale.
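As an illustration of the two agreement measures, the sketch below computes Cohen's kappa and the percentage of total agreement for paired categorical ratings. It assumes the unweighted form of kappa (the text does not say whether a weighted variant was used for the ordinal items); the function names and the example ratings are hypothetical.

```python
from collections import Counter

def cohen_kappa(first, second):
    """Return (unweighted kappa, raw agreement) for paired categorical ratings."""
    if len(first) != len(second) or not first:
        raise ValueError("need two non-empty rating lists of equal length")
    n = len(first)
    p_observed = sum(a == b for a, b in zip(first, second)) / n
    freq_first, freq_second = Counter(first), Counter(second)
    # Agreement expected by chance, from the marginal frequencies
    p_chance = sum(freq_first[c] * freq_second[c] for c in freq_first) / n ** 2
    if p_chance == 1.0:
        return float("nan"), p_observed  # kappa undefined: no variation at all
    return (p_observed - p_chance) / (1 - p_chance), p_observed

def classify_kappa(kappa):
    """Labels following the cut-offs quoted in the text."""
    if kappa > 0.80:
        return "excellent"
    if kappa > 0.60:
        return "good"
    if kappa > 0.40:
        return "moderate"
    if kappa > 0.20:
        return "fair"
    return "poor"

# Hypothetical four-point answers from six patients at test and retest
test = ["not at all", "not at all", "slightly", "much", "slightly", "not at all"]
retest = ["not at all", "slightly", "slightly", "much", "slightly", "not at all"]
kappa, agreement = cohen_kappa(test, retest)
print(round(kappa, 2), round(100 * agreement), classify_kappa(kappa))
```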

Internal consistency

Cronbach's alpha (α) was calculated in order to estimate how consistently the subjects responded to the separate questions within each domain. Alpha values of .70 or higher were considered to be sufficient.21 
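For readers who want to reproduce this kind of estimate, a minimal sketch of Cronbach's alpha from a respondents-by-items score matrix is given below. It uses the standard variance-based formula; the function name and the example scores are hypothetical and are not study data.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an array of shape (n_respondents, k_items)."""
    x = np.asarray(items, dtype=float)
    k = x.shape[1]
    item_variances = x.var(axis=0, ddof=1)      # variance of each item
    total_variance = x.sum(axis=1).var(ddof=1)  # variance of the summary score
    return k / (k - 1) * (1 - item_variances.sum() / total_variance)

# Hypothetical VAS answers (0-100): five respondents, four items in one domain
domain = [
    [20, 30, 25, 35],
    [40, 45, 50, 40],
    [60, 55, 65, 70],
    [80, 85, 75, 90],
    [50, 60, 55, 45],
]
alpha = cronbach_alpha(domain)
print(round(alpha, 2), "sufficient" if alpha >= 0.70 else "below the .70 criterion")
```

Alpha rises when items covary strongly relative to their individual variances, and it also tends to rise with the number of items, which matters when domains of different length are compared.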

All 60 patients filled in the questionnaire twice, at an average interval of 12 days; there were no dropouts.

Test-Retest Reliability

The reliability of the questionnaire, based on summary scores from each subject, was excellent (ICC = .84–.92) for all five domains (Table 7). There were only small differences in domain reliability between the group of patients yet to enter treatment and those already in treatment. However, a discrepancy for the domain “treatment motivation” was observed between girls and boys. The calculated reliability was good for the girls (ICC = .68) and excellent for the boys (ICC = .88).

Table 7.

Reliability of the Summary Scores for the Domains in the Questionnaire

Tables 2 through 6 present the reliability of the separate questions within each domain. ICC ranged between .21 and .88 and kappa ranged between .30 and .93. Overall, reliability was good to excellent. However, questions 6, 18, and 19 showed poor reliability and questions 8 and 15 presented fair reliability.

In the domain “functional jaw impairment” questions 31 and 34 exhibited fair reliability (κ = .30 and .40) and questions 27, 29, and 35 exhibited moderate reliability (κ = .58, .52, and .48). These questions were, however, considered acceptable because percentage of total agreement was comparable with the other questions in this domain (Table 5).

Internal Consistency

Cronbach's alpha for the separate domains was .67–.87 at the first assessment and .63–.94 at the second (Table 7). Internal consistency was therefore considered sufficient for all five domains, although the lowest values fell slightly below the .70 criterion. The difference between the two assessments illustrates sampling variability.

Face Validity

The fifth domain contained four questions, one for each questionnaire domain, wherein the patients were asked whether they considered the questions to be relevant. Very high scores (80–94) were obtained at the VAS assessment (0–100) for face validity. See Table 6.

The reliability and validity of a questionnaire are decisive for its precision and are the criteria for drawing generalized conclusions. We investigated two types of reliability here: temporal stability and internal consistency. The most important findings were that a new questionnaire concerning motivation, expectations, and experiences of orthodontic treatment in adolescents had good to excellent test-retest reliability and that the questions within each questionnaire domain had acceptable consistency. Good face validity was ensured by consulting patients in the retention phase (focus groups) when the new questionnaire was developed and by asking the patients whether they considered the questions to be relevant. The stated hypothesis, that a questionnaire designed largely from focus group interviews would be reliable and valid, was thus confirmed. This means that adequate and applicable questions, easily understood by adolescents, could be constructed with the help of focus group interviews. Furthermore, the gender and age distribution in the study was similar to that in other studies of adolescents undergoing orthodontic treatment,9,22,23 and the results were therefore considered to be representative of these individuals.

Two types of assessment scales were used: the VRS and the VAS. Both are common methods for assessing pain and functional impairment in children and are considered to be reliable and valid methods.24 In this questionnaire, both separate questions and composite scores for each domain were evaluated. It was therefore important that acceptable and sufficient consistency be ensured within each domain. Cronbach's alpha was high for the domains “functional jaw impairment” and “questionnaire validity” (α = .84–.94) and lower but acceptable for the domains “treatment motivation,” “treatment expectations,” and “pain and discomfort from teeth, jaws, and face” (α = .63–.85). An increased number of items within these three domains would probably have improved consistency and homogeneity, but because it was important that the patients be able to complete the questionnaire relatively quickly, the number of items was restricted.

Test-retest reliability based on summary scores was excellent for all five questionnaire domains in this study (ICC = .84–.92). The domains “treatment motivation” and “treatment expectations” were assessed only by the 30 subjects yet to undergo orthodontic treatment. A probable cause for the difference in reliability for the domain “treatment motivation” between boys (excellent) and girls (good) could therefore be the small sample size.

The reliability of the domain “functional jaw impairment” was excellent (ICC = .92), which is in agreement with Stegenga,18 who used the scale with patients with temporomandibular disorders. The reliability found by Marcusson,25 however, who used the scale on adult cleft lip and palate patients, was lower (ICC = .67). To our knowledge, this scale has not been used on ordinary orthodontic patients before.

It is important to bear in mind that when questionnaire reliability is based on composite scores, one loses the opportunity to analyze details in individual questions; therefore, reliability was also tested on all individual questions. The test-retest reliability of the individual questions was acceptable overall. High reliability is, however, difficult to achieve in homogeneous populations because reliability is a measure of how well the variable can distinguish between subjects. Because the subjects in our study formed a very homogeneous group of healthy adolescents with no or few symptoms, this phenomenon was illustrated in a few individual questions.
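The dependence of reliability on between-subject spread can be made explicit with the usual variance-components reading of the ICC. The formula and symbols below are an assumption for illustration (the text does not state them): \sigma_s^2 denotes the variance between subjects and \sigma_e^2 the measurement error variance. The numerical values are illustrative only and are not study data.

```latex
\begin{align}
\mathrm{ICC} &= \frac{\sigma_s^2}{\sigma_s^2 + \sigma_e^2} \\
\sigma_s^2 = 100,\ \sigma_e^2 = 25 &\;\Rightarrow\; \mathrm{ICC} = \frac{100}{125} = .80 \\
\sigma_s^2 = 10,\ \sigma_e^2 = 25 &\;\Rightarrow\; \mathrm{ICC} = \frac{10}{35} \approx .29
\end{align}
```

With the same measurement error, shrinking the spread between subjects moves the coefficient from excellent to poor, even though the instrument itself has not changed.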

To increase the range of the two domains on potential inconveniences (“pain and discomfort” and “functional jaw impairment”), it was essential that the questions be assessed both by patients who had not yet started treatment and by patients in active treatment (60 patients altogether). However, because the test-retest estimation had to be performed under similar and stable circumstances, the subjects in active treatment were assessed during the last two weeks before an appointment, usually a time interval with few symptoms of pain and discomfort, and the study group was therefore still relatively homogeneous.

Three individual questions (6, 18, and 19) in this study had poor reliability, and four questions (8, 15, 31, and 34) had fair reliability. The question “Have you been properly informed about the orthodontic treatment?” had an ICC of .27 (Figure 1). It is known from other studies26 that pretreatment information is an important factor for future compliance and for pain and discomfort experiences, but because our subjects systematically scored lower at the second assessment, the reliability of this question is poor and the question will not be used in further studies. However, the poor reliability was probably an effect of an incorrect assumption that the circumstances between the assessments were stable.

Figure 1.

Plot of difference against mean for question 6, “Have you been properly informed about the orthodontic treatment?” at first and second assessment. ICC = .27; mean difference, 4.4; 95% limits of agreement, −14.6 to 23.4.
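The quantities reported in this kind of difference-against-mean plot can be computed as sketched below. The sketch assumes the conventional definition of the 95% limits of agreement (mean difference ± 1.96 standard deviations of the differences) and takes the difference as first minus second assessment; neither detail is stated in the text, and the function name and scores are hypothetical.

```python
import statistics

def limits_of_agreement(first, second):
    """Return (mean difference, lower limit, upper limit) for paired scores.

    Differences are taken as first minus second assessment (an assumption).
    """
    diffs = [a - b for a, b in zip(first, second)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation
    half_width = 1.96 * sd_diff
    return mean_diff, mean_diff - half_width, mean_diff + half_width

# Hypothetical VAS scores (0-100) for five patients
first = [80, 75, 90, 60, 85]
second = [78, 70, 85, 62, 80]
mean_diff, lower, upper = limits_of_agreement(first, second)
print(round(mean_diff, 1), round(lower, 1), round(upper, 1))
```

Under this definition, the limits in the caption (−14.6 to 23.4 around a mean difference of 4.4) correspond to a half-width of 19.0, that is, a standard deviation of the differences of roughly 9.7 VAS units.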

It should also be noted that the two other questions with poor reliability (18 and 19), “Do you have pain from your molars when they are in contact?” (ICC = .39) and “Do you have pain from your molars when they are not in contact?” (ICC = .21), demonstrated the problem with homogeneous data sets. These two questions could also easily be mixed up or be difficult to understand, especially for patients with no previous orthodontic experience. In Figure 2, one outlier (21.5; 43) in a population with little variability decreased the ICC value from .69 to .39.

Figure 2.

Plot of difference against mean for question 18, “Do you have pain from your molars when they are in contact?” at first and second assessment. ICC = .39; mean difference, −0.5; 95% limits of agreement, −14.1 to 13.1

Moreover, questions 31 and 34, “If you have pain and discomfort from your teeth and jaws, how much does that affect drinking?” and “If you have pain and discomfort from your teeth and jaws, how much does that affect yawning?” had kappa values of .30 and .40, that is, fair reliability. Percentage agreements for the repeated assessments were, however, 93% and 92%, which indicates that these questions are acceptable. The low kappa values arose because most subjects did not experience any difficulties, so the agreement expected by chance was high (Table 8).

Table 8.

Question 31, “If You Have Pain or Discomfort in Your Teeth and Jaws, How Much Does That Affect Drinking?” (κ = .30)
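The effect of skewed answers on kappa can be illustrated with a small cross-tabulation. The counts below are hypothetical and are NOT the contents of Table 8; they were chosen only to show that the same raw agreement can yield very different kappa values depending on how the answers are distributed. The function name is likewise an assumption.

```python
def kappa_from_table(table):
    """Return (unweighted kappa, raw agreement) from a square table of counts.

    Rows index the first assessment, columns the second.
    """
    n = sum(sum(row) for row in table)
    p_observed = sum(table[i][i] for i in range(len(table))) / n
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    # Chance agreement from the marginal totals
    p_chance = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
    return (p_observed - p_chance) / (1 - p_chance), p_observed

# Hypothetical counts for 60 patients, categories: "not at all", "slightly"
skewed = [[55, 2], [2, 1]]      # almost everyone symptom-free
balanced = [[28, 2], [2, 28]]   # answers evenly split

kappa_skewed, agree_skewed = kappa_from_table(skewed)
kappa_balanced, agree_balanced = kappa_from_table(balanced)
print(round(agree_skewed, 2), round(kappa_skewed, 2))
print(round(agree_balanced, 2), round(kappa_balanced, 2))
```

Both tables have 56 of 60 patients giving the same answer twice, yet kappa is far lower in the skewed table because chance agreement is already about 90% when nearly all patients answer “not at all.”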

To ensure the legitimacy of the questionnaire, a fifth domain was added, “questionnaire validity,” which contained four questions about whether the items in the respective domains of the questionnaire reflected the subjects' opinions regarding expectations and experience of orthodontic treatment. These four questions exhibited high median values (average 89) on the VAS, which confirms that the questions were applicable and relevant.

This questionnaire was developed for a detailed scientific study of patients' experience of new orthodontic techniques, from the decision to undergo treatment to satisfaction with the outcome. It was therefore essential to establish that the questions were reliable and valid. The focus group interviews explored different aspects of treatment experiences, and to ensure that the questions asked were valid, all these aspects had to be considered. For everyday clinical use, this questionnaire is somewhat extensive, but shortening the questionnaire by selecting a few questions is not advisable because consistency and validity can then no longer be guaranteed. However, because all questionnaire domains had excellent reliability and acceptable consistency, the domains could easily be used separately as “short versions.” For example, questions 1–5 and 7–11 could be used before treatment in order to establish patients' motivation and interest. Applicable questions from the domain “pain and discomfort from the teeth, jaws, and face” could be used during orthodontic treatment to study appliance acceptance, and the fourth domain, “functional jaw impairment,” could be used to study long-term effects during orthodontic treatment.

  • A vast majority of the questions in each domain exhibited acceptable test-retest reliability, and composite scores yielded good to excellent reliability for all domains. Internal consistency within each questionnaire domain was acceptable. Good face validity was found for the domains.

  • The questionnaire, which was largely designed from focus group interviews, can be recommended for use in the assessment of orthodontic treatment.

We wish to express our sincere thanks to Professor Arne Halling for having inspired us to use focus group interviews. This study was supported with grants by The Centre for Research and Development, Uppsala University/County Council of Gävleborg, Sweden, and the Swedish Dental Society.

References

1. Brown, D. and R. Moerenhout. The pain experience and psychological adjustment to orthodontic treatment of pre-adolescents, adolescents and adults. Am J Orthod Dentofacial Orthop 1991. 100:349–356.
2. Fernandes, L. M., B. Ogard, and L. Skoglund. Pain and discomfort experienced after placement of a conventional or a superelastic NiTi aligning archwire. J Orofac Orthop/Fortschr Kieferorthop 1998. 59:331–339.
3. Scheurer, P. A., A. R. Firestone, and W. B. Burgin. Perception of pain as a result of orthodontic treatment with fixed appliances. Eur J Orthod 1996. 18:349–357.
4. Kvam, E., N. R. Gjerdet, and O. Bondevik. Traumatic ulcers and pain in adults during orthodontic treatment. Community Dent Oral Epidemiol 1987. 15:104–107.
5. Jones, M. L. and S. Richmond. Initial tooth movement: force application and pain—a relationship? Am J Orthod 1985. 88:111–116.
6. Pancherz, H. and M. Anehus-Pancherz. The effect of continuous bite jumping with the Herbst appliance on the masticatory system: a functional analysis of treated Cl II malocclusions. Eur J Orthod 1982. 4:37–44.
7. Bergius, M., S. Kiliaridis, and U. Berggren. Pain in orthodontics. J Orofac Orthop/Fortschr Kieferorthop 2000. 61:125–137.
8. Sergl, H. G., U. Klages, and A. Zentner. Pain and discomfort during orthodontic treatment: causative factors and effects on compliance. Am J Orthod Dentofacial Orthop 1998. 114:684–691.
9. Doll, G. M., A. Zentner, U. Klages, and H. G. Sergl. Relationship between patient discomfort, appliance acceptance and compliance in orthodontic therapy. J Orofac Orthop/Fortschr Kieferorthop 2000. 61:398–413.
10. Bennett, M. E., C. Michaels, K. O'Brien, R. Weyant, C. Philips, and K. Dryland. Measuring beliefs about orthodontic treatment: a questionnaire approach. J Public Health Dent 1997. 57:215–223.
11. Bos, A., J. Hoogstraten, and B. Prahl-Andersen. Attitudes towards orthodontic treatment: a comparison of treated and untreated subjects. Eur J Orthod 2005. 27:148–154.
12. Atchison, K. A., E. E. Black, R. Leathers, et al. A qualitative report of patient problems and postoperative instructions. J Oral Maxillofac Surg 2005. 63:449–456.
13. Bennett, M. E., J. F. Tulloch, K. W. Vig, and C. L. Philips. Measuring orthodontic treatment satisfaction: questionnaire development and preliminary validation. J Public Health Dent 2001. 61:155–160.
14. Streiner, D. L. and G. R. Norman. Health Measurement Scales: A Practical Guide to Their Development and Use. Oxford, UK: Oxford University Press; 2003:126–151.
15. Portney, L. G. and M. P. Watkins. Foundations of Clinical Research. Applications and Practice. Norwalk, Conn: Appleton & Lange; 1993:509–516.
16. Fox, R. N., J. E. Albino, L. J. Green, S. D. Farr, and L. A. Tedesco. Development and validation of a measure of attitudes toward malocclusion. J Dent Res 1982. 61:1039–1043.
17. Clemmer, E. J. and E. W. Hayes. Patient cooperation in wearing orthodontic headgear. Am J Orthod 1979. 75:517–524.
18. Stegenga, B., L. G. de Bont, and G. Boering. Temporomandibular joint pain assessment. J Orofac Pain 1993. 7:23–37.
19. Fleiss, J. L. The Design and Analysis of Clinical Experiments. New York, NY: Wiley & Sons; 1986:1–32.
20. Landis, J. R. and G. G. Koch. The measurement of observer agreement for categorical data. Biometrics 1977. 33:159–174.
21. Streiner, D. L. Starting at the beginning: an introduction to coefficient alpha and internal consistency. J Pers Assess 2003. 80:99–103.
22. Caniklioglu, C. and Y. Öztürk. Patient discomfort: a comparison between lingual and labial fixed appliances. Angle Orthod 2004. 75:86–91.
23. Firestone, A. R., P. A. Scheurer, and W. B. Bürgin. Patients' anticipation of pain and pain-related side-effects and their perception of pain as a result of orthodontic treatment with fixed appliances. Eur J Orthod 1999. 21:387–396.
24. Scott, J., B. M. Ansell, and E. C. Huskisson. The measurement of pain in juvenile chronic polyarthritis. Ann Rheum Dis 1977. 36:186–187.
25. Marcusson, A., T. List, G. Paulin, and I. Akerlind. Reliability of a multidimensional questionnaire for adults with treated complete cleft lip and palate. Scand J Plast Reconstr Hand Surg 2001. 35:271–278.
26. Wardle, J. Psychological management of anxiety and pain during dental treatment. J Psychosom Res 1983. 27:399–402.

Author notes

Corresponding author: Dr Ingalill Feldmann, Orthodontic Clinic, Box 57, SE-801 02 Gävle, Sweden