The objective of this study was to evaluate the accuracy of Invisalign technology in achieving predicted tooth positions with respect to tooth type and direction of tooth movement.
The posttreatment models of 30 patients who had nonextraction Invisalign treatment were digitally superimposed on their corresponding virtual treatment plan models using best-fit surface-based registration. The differences between actual treatment outcome and predicted outcome were computed and tested for statistical significance for each tooth type in mesial-distal, facial-lingual, and occlusal-gingival directions, as well as for tip, torque, and rotation. Differences larger than 0.5 mm for linear measurements and 2° for angular measurements were considered clinically relevant.
Statistically significant differences (P < .05) between predicted and achieved tooth positions were found for all teeth except maxillary lateral incisors, canines, and first premolars. In general, anterior teeth were positioned more occlusally than predicted, rotation of rounded teeth was incomplete, and movement of posterior teeth in all dimensions was not fully achieved. However, except for excess posttreatment facial crown torque of maxillary second molars, these differences were not large enough to be clinically relevant.
Although Invisalign is generally able to achieve predicted tooth positions with high accuracy in nonextraction cases, some of the actual outcomes may differ from the predicted outcomes. Knowledge of dimensions in which the final tooth position is less consistent with the predicted position enables clinicians to build necessary compensations into the virtual treatment plan.
INTRODUCTION
In the past two decades, the field of orthodontics has been revolutionized by technological advancements. Three-dimensional imaging has expanded diagnostic and treatment planning abilities,1 intraoral scanners now provide an alternative to traditional impressions, and digital models can replace plaster models for both treatment planning and appliance fabrication.2,3 Combined with increasing patient demand for esthetic treatment options and the drive toward personalized treatment, these developments have given rise to a number of clear aligner systems now serving as alternatives to conventional bracket-and-archwire systems.4
In 1999, Align Technology (Santa Clara, Calif) introduced Invisalign as the pioneer clear aligner system for comprehensive orthodontic treatment. Invisalign has continually evolved through the development of new aligner materials, attachments on teeth, staging of tooth movement, and incorporation of interproximal reduction and interarch elastics to address a wider range of malocclusions.5,6
According to the company's internal data, more than 3 million patients have been treated with the Invisalign system in more than 90 countries worldwide. Despite its widespread use, relatively few studies have quantified the system's efficacy. This is significant because it has been suggested that aligners have limitations when it comes to producing certain tooth movements.5 For instance, questions have been raised regarding the extent to which aligners can control extrusion, rotation, bodily movement, and torque.5 Some authors even doubt that bodily movements or torque can be accomplished at all by aligners.7 Therefore, the aim of this study was to evaluate the efficacy of Invisalign technology to achieve predicted tooth positions with respect to tooth type and direction of movement.
MATERIALS AND METHODS
Approval for this retrospective cohort study was granted by the institutional review board at the University of Minnesota (Study 1411M56201). A total of 30 consecutive patients (13 male, 17 female; age 21.6 ± 9.8 years) were selected based on the following inclusion criteria: full permanent dentition including second molars in both arches, nonextraction Invisalign treatment with no deviation from the default amounts of tooth movement embedded in each aligner stage, aligners changed every 2 weeks following the manufacturer's protocol, no midcourse corrections or additional aligners, and no combined treatment with fixed appliances, intraoral distalizers, or other auxiliary appliances. Patients were excluded if they required oral surgery or received dental restorations during treatment. Treatment was provided by 12 orthodontic residents and 10 orthodontists certified in the use of the Invisalign system. The residents provided care under the supervision of the orthodontists, and each virtual treatment plan was reviewed and approved by an orthodontist to ensure that no unrealistic goals were set. The average treatment time was 11 ± 4 months. Of the 30 patients, 22 had class I molar occlusions, 7 had class II molar occlusions, and 1 had a class III molar occlusion (all less than 2 mm). The average amount of crowding in each arch was 2 ± 2 mm. Interproximal enamel reduction was performed as prescribed in each patient's virtual treatment plan.
To obtain posttreatment digital models, all patients had alginate impressions taken and poured into plaster casts, which were then digitized using an R700 orthodontic model scanner (3Shape A/S, Copenhagen, Denmark). To obtain digital models of the predicted outcome (virtual treatment plan models), the final stage of each patient's virtual treatment plan was exported through Align Technology's ClinCheck program and converted to digital models using e-model 9.0 software (GeoDigm Corporation, Falcon Heights, Minn). All digital models were de-identified, and soft tissue and bonded retainers were digitally removed to ensure that evaluation was based solely on tooth-surface features. The posttreatment models were segmented to isolate each tooth as a separate object and compared with the unsegmented virtual treatment plan models using e-model Compare 8.1 software (GeoDigm). This software compares individual tooth positions between two digital models. Corresponding dental arches are first aligned globally, and then individual teeth from a segmented model are superimposed on analogous teeth of an unsegmented model using a best-fit algorithm so that differences between tooth positions can be computed (Figure 1).
For global alignment, the mesial-buccal cusps of the first molars and the mesial-incisal point of the right central incisor in each arch were used as matching points for initial registration. This initial registration was then refined by 50 iterations of a closest-point algorithm to achieve best fit of the occlusal surfaces. A single operator then placed a reference coordinate system on each tooth of the posttreatment model, with the origin of the axes at the tooth's approximate center of resistance. Because the center of resistance depends on many local factors such as tooth morphology, root length, attachment levels, and the direction of force application,8,9 which could not be determined for each individual tooth, the origin was placed at a point in the center of each tooth, 8 mm apical to the cemento-enamel junction. This placement was based on the assumption that the center of resistance is situated halfway between the alveolar crest and the root apex, on the average root length of all tooth types, and on a biologic width of 2 mm.10–12 Once the axes were placed in the posttreatment model, the software automatically generated analogous axes for each corresponding tooth in the virtual treatment model. The software then superimposed individual teeth from the segmented posttreatment model on the corresponding teeth in the unsegmented virtual treatment model using best-fit surface-based registration. Based on the transformation of axes required to fit each tooth, the software quantified the differences between achieved and predicted position for each tooth in the following six directions: mesial-distal, facial-lingual, occlusal-gingival, tip, torque, and rotation. The differences were expressed with respect to the center of resistance.
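To illustrate how a rigid transformation between two tooth positions can be broken down into six components, the following is a minimal sketch and not the algorithm used by the e-model Compare software, which is not described in detail here. The axis convention (x = mesial-distal, y = facial-lingual, z = occlusal-gingival), the Z-Y-X rotation order, and the function names `rotation_matrix` and `six_components` are assumptions made for illustration only.

```python
import math

def rotation_matrix(torque_deg, tip_deg, rotation_deg):
    """Build R = Rz(rotation) * Ry(tip) * Rx(torque).

    Assumed convention (hypothetical): torque is a rotation about the
    mesial-distal (x) axis, tip about the facial-lingual (y) axis, and
    rotation about the occlusal-gingival (z) long axis.
    """
    a, b, c = (math.radians(v) for v in (torque_deg, tip_deg, rotation_deg))
    ca, sa = math.cos(a), math.sin(a)
    cb, sb = math.cos(b), math.sin(b)
    cc, sc = math.cos(c), math.sin(c)
    return [
        [cc * cb, cc * sb * sa - sc * ca, cc * sb * ca + sc * sa],
        [sc * cb, sc * sb * sa + cc * ca, sc * sb * ca - cc * sa],
        [-sb, cb * sa, cb * ca],
    ]

def six_components(R, t, origin):
    """Split the rigid transform p -> R p + t into three linear and three
    angular differences, with the linear part measured at `origin`
    (for example, an approximate center of resistance)."""
    moved = [sum(R[i][j] * origin[j] for j in range(3)) + t[i] for i in range(3)]
    dx, dy, dz = (moved[i] - origin[i] for i in range(3))
    return {
        "mesial_distal_mm": dx,
        "facial_lingual_mm": dy,
        "occlusal_gingival_mm": dz,
        "torque_deg": math.degrees(math.atan2(R[2][1], R[2][2])),
        "tip_deg": math.degrees(math.asin(-R[2][0])),
        "rotation_deg": math.degrees(math.atan2(R[1][0], R[0][0])),
    }
```

Because the linear components are evaluated at a chosen reference point, the same rigid transformation yields different millimeter values at the crown tip than at the center of resistance whenever a rotation is present, which is why the choice of origin matters.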
Because the software allowed for the detection of differences that were too small to be clinically relevant, threshold values were chosen in reference to the American Board of Orthodontics (ABO) model grading system for case evaluation.13 According to the model grading system criteria, discrepancies of 0.5 mm or greater in the alignment of contact points and marginal ridges will result in the deduction of points. A marginal ridge discrepancy of 0.5 mm equates to a crown-tip deviation of 2° for an average-sized molar. Therefore, differences of 0.5 mm or more in the mesial-distal, facial-lingual, and occlusal-gingival directions and differences of 2° or more in tip, torque, and rotation were considered clinically relevant.
Data from each patient's left and right analogous teeth were pooled for analysis after a linear mixed model had been used to verify that there were no significant side differences. Descriptive statistics were computed for the differences between predicted and achieved tooth positions in each of the six directions. A linear mixed model was used to calculate the corresponding 95% confidence interval for each mean difference. To assess whether the differences were statistically significant, P values were calculated using a false discovery rate method to adjust for the multiple comparisons performed.
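The specific false discovery rate method is not named above. As one common possibility, the Benjamini-Hochberg step-up adjustment can be sketched as follows; the function name `benjamini_hochberg` is illustrative, and the study's SAS implementation may have differed.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order.

    Assumption: this is one standard false discovery rate procedure and
    is not confirmed to be the one used in the study.
    """
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

A comparison would then be called statistically significant when its adjusted P value falls below .05.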
Equivalence testing was used to assess whether the differences between predicted and achieved tooth positions were large enough to be clinically relevant. Two one-sided t-tests were used to test for differences above 0.5 mm and below −0.5 mm, and above 2° and below −2°. Mean differences that fell within −0.5 mm to +0.5 mm for linear measurements and within −2° to +2° for angular measurements were deemed practically equivalent and therefore considered too small to be clinically relevant.
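The logic of the two one-sided tests can be sketched in simplified form. The sketch below treats the differences as one independent sample rather than fitting the linear mixed model used in the study, and the caller supplies the one-sided critical t value (for example, about 1.699 for 29 degrees of freedom at α = .05). The function name `tost_one_sample` is hypothetical.

```python
import math
import statistics

def tost_one_sample(diffs, bound, t_crit):
    """Two one-sided t-tests for equivalence within (-bound, +bound).

    Simplified illustration: assumes independent observations, which the
    study handled differently (linear mixed model).
    """
    n = len(diffs)
    mean = statistics.fmean(diffs)
    se = statistics.stdev(diffs) / math.sqrt(n)
    t_lower = (mean + bound) / se  # tests H0: true mean <= -bound
    t_upper = (mean - bound) / se  # tests H0: true mean >= +bound
    # Equivalence is concluded only if both null hypotheses are rejected.
    equivalent = t_lower > t_crit and t_upper < -t_crit
    return {"mean": mean, "t_lower": t_lower, "t_upper": t_upper,
            "equivalent": equivalent}
```

With `bound=0.5` for linear measurements in millimeters or `bound=2.0` for angular measurements in degrees, a result of `equivalent=True` corresponds to a difference considered too small to be clinically relevant.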
A post-hoc power analysis was performed based on 30 independent samples to estimate the power of the study to detect differences that were small enough to fall within the equivalence region of −0.5 mm to +0.5 mm or −2° to +2°.
Because the calculated differences between predicted and achieved tooth positions included both positive and negative values, an additional analysis was performed on the absolute values of the differences to eliminate the possibility of positive and negative values averaging out to a mean close to zero and giving a false impression of clinical accuracy. For this analysis, the absolute values were log transformed to normalize their distributions, and a one-sided test of equivalence was applied to the log-transformed values.
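A small sketch shows why the absolute-value analysis is needed and how a one-sided test on log-transformed values can be set up. The data handling and the function names are illustrative assumptions, not the study's SAS code, and the sketch again treats the observations as independent.

```python
import math
import statistics

def signed_vs_absolute(diffs):
    """Return (mean of signed differences, mean of absolute differences)."""
    return statistics.fmean(diffs), statistics.fmean([abs(d) for d in diffs])

def log_abs_t_statistic(diffs, bound):
    """t statistic for H0: mean of log|difference| >= log(bound).

    A strongly negative value supports absolute differences below the
    bound. Assumption: no difference is exactly zero, because log(0) is
    undefined and math.log raises ValueError for it.
    """
    logs = [math.log(abs(d)) for d in diffs]
    se = statistics.stdev(logs) / math.sqrt(len(logs))
    return (statistics.fmean(logs) - math.log(bound)) / se
```

For example, the differences 0.4, −0.4, 0.3, and −0.3 mm have a signed mean of zero but a mean absolute difference of 0.35 mm, which is the situation the additional analysis guards against.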
Placement of coordinates and superimposition were repeated for 10 randomly selected patients, and Pearson correlation coefficients and Bland-Altman analyses were used to assess intraoperator agreement for each of the six directions of tooth movement. Statistical analyses were performed with SAS 9.4 for Windows (SAS Institute Inc, Cary, N.C.), and P values of less than 0.05 were considered statistically significant.
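The two agreement measures can be sketched as follows for paired repeat measurements. This is a generic illustration of Pearson correlation and Bland-Altman limits of agreement, with hypothetical function names, and is not the SAS code used in the study.

```python
import math
import statistics

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def bland_altman(first, second):
    """Return (bias, lower limit, upper limit) of agreement.

    Bias is the mean of the paired differences; the limits are the
    bias plus or minus 1.96 standard deviations of those differences.
    """
    diffs = [a - b for a, b in zip(first, second)]
    bias = statistics.fmean(diffs)
    sd = statistics.stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd
```

A high correlation alone does not rule out a systematic offset between the first and second measurement, which is why the Bland-Altman bias and limits are examined alongside it.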
RESULTS
The power to detect differences small enough to fall below the thresholds set for clinical relevance in the various directions is reported in Table 3. Because the power calculations treated the 30 patients as independent samples, whereas each patient contributed 28 teeth, the actual power is likely even higher.
The differences between predicted and achieved tooth positions are reported in Table 4. In the maxillary arch, several tooth types showed statistically significant differences between predicted and achieved tooth positions: The central incisor was positioned more facially and occlusally and had more lingual crown torque. The second premolar was more distal and lingual relative to the predicted position and had more facial crown torque. The first molar was aberrant in these same three directions and had more mesial crown tip than predicted. The second molar had more facial crown torque compared to the prediction and was positioned more lingually and occlusally. This difference in maxillary second molar torque between the predicted and achieved position exceeded 2° and was therefore considered clinically relevant.
In the mandibular arch, all tooth types showed statistically significant differences. Both central and lateral incisors were positioned more occlusally than predicted. The lateral incisor also had more mesial rotation. The canines were more lingual and had more facial crown torque and distal rotation than predicted. Both first and second premolars had more mesial rotation than predicted. Finally, both the first and second molars had more facial crown torque than predicted. The second molar also had more distal crown tip. Although statistically significant, none of these differences in the mandibular arch were considered clinically relevant.
Statistical analysis performed on the absolute value of each discrepancy measurement did not reveal any additional differences that had not been detected previously.
DISCUSSION
Using mathematical superimposition of digital models, it has become possible to quantify treatment changes and, as in the present study, discrepancies between virtual treatment plans and actual treatment outcomes. Previous studies that evaluated the effectiveness of clear aligner therapy used the ABO model grading system,14–16 Tooth Measure (Align Technology),17–20 or Surfacer (Imageware, Plano, Tex) software.6 Although these tools can provide a general assessment of accuracy, the software used in the current study was uniquely able to quantify differences between objects with respect to six degrees of freedom.21 The software calculates differences automatically, so the calculation itself is not influenced by potential operator bias, and it has previously been used in other projects directed at quality control and outcomes assessment.22–24 However, the multistep process used to create the digital models constituted a potential source of error. The dimensional accuracy of the plaster casts from which the digital models were derived may have been affected by possible distortion or shrinkage of the impression material, and the accuracy of the digitization may have been limited by the resolution of the model scanner (20 μm).
Predicted and achieved tooth positions differed in most tooth types. Anterior teeth were often positioned too far occlusally, rounded teeth such as mandibular canines and premolars were not fully rotated, and posterior teeth had discrepancies in all directions. The largest difference was found for maxillary second molar torque, which exceeded 2° and was therefore considered clinically relevant. This difference in facial-lingual inclination of second molars has also been described following treatment with traditional fixed appliances.25 This may be related to the decreasing amount of force exerted by the end of an archwire as interbracket distance and flexibility of the wire increase. Moreover, molars have larger root surface areas and require greater forces for tooth movement.10 The same concept may apply to clear aligner therapy; there seems to be greater flexibility and less force exerted by the posterior segments of aligners.
In addition, maxillary posterior teeth were positioned more lingually and with more facial crown torque than predicted. It is likely that maxillary arch expansion was not fully achieved and the molars tipped rather than moved bodily during the process, both of which could have resulted from flexing of the aligners. This notion is supported by a recent study that found the mean accuracy of maxillary expansion with Invisalign to be 72.8%, with more tipping observed than predicted in the virtual treatment plan.26 The mandibular molars also had more facial crown torque than predicted. This, too, could be the consequence of an inability of the aligners to fully express the torque specified in the virtual treatment plan and may have been compounded by biological limitations such as proximity of the molar roots to the cortical plate of the mandible.
With regard to incisors, the results of the current study resemble those of others that found movements of anterior teeth to have relatively poor accuracy. For instance, both Kravitz et al.17 and Krieger et al.18 reported deficiencies in vertical incisor movement with only 44% to 46% of the predicted intrusion achieved for central incisors. Thus, significant correction of a deep overbite with Invisalign appears difficult. Similarly, the difference in maxillary central incisor torque found in the current sample was consistent with other studies that observed tipping of incisors rather than bodily movement.19,27 Possible reasons for the more upright positions include potential torque loss during space closure or the use of class II interarch elastics. Surprisingly, no significant differences in the occlusal-gingival position or torque of the maxillary lateral incisors were found in the patients in this study. Anecdotally, these teeth are often challenging to treat and, because of their shape, sometimes require the use of auxiliaries such as bonded buttons combined with intraoral elastics.17
Interestingly, there were statistically significant differences in mandibular canine facial-lingual position, torque, and rotation, but not tip. Although the discrepancies in mandibular canine torque and rotation in the current data coincided with previous reports of minimal torque changes and low accuracy of canine rotation produced by clear aligners,28,29 the minimal discrepancies in angulation differed from studies that found low accuracy of mesiodistal tipping of mandibular canines.17 In contrast to mandibular canine position, maxillary canine position did not differ significantly from the prediction. This is of interest because the accuracy of rotation and tip of maxillary canines has been described as notoriously low,17,28 likely because these teeth have the longest roots in the dentition with large root surface areas, requiring greater force to produce orthodontic tooth movement.10
Although the present results suggest that the intraarch tooth position predicted by the virtual treatment plan is not consistently achieved by Invisalign aligners and some limitations in the appliance system remain, they in no way suggest unsatisfactory treatment results. In fact, the difference in maxillary second molar torque between predicted and achieved position was the only difference considered clinically relevant. When planning clear aligner therapy, clinicians may utilize Align Technology's ClinCheck program to design their biomechanics rather than merely as a tool for visualization of predicted treatment outcomes. Knowledge of the dimensions in which the final tooth position is less consistent with the predicted position allows them to pay extra attention to more challenging movements, modify attachments, and build specific overcorrections into their virtual treatment plans to increase efficiency and achieve better treatment outcomes.
CONCLUSIONS
In general, Invisalign is able to achieve predicted tooth positions with high accuracy in nonextraction cases. Clinicians may consider the following when planning treatment with Invisalign:
Maxillary arch expansion may not be fully achieved.
Mandibular incisors tend to be positioned more occlusally than predicted.
Rotation of rounded teeth may be incomplete.
Molar torque may not be fully achieved, with maxillary second molars often having a clinically relevant magnitude of more facial crown torque than predicted.