## Abstract

**Objective:** To explore how many millimeters of tooth size discrepancy (TSD) are clinically significant, to determine what percentage of a representative orthodontic population has such a tooth size discrepancy, and to determine the ability of simple visual inspection to detect such a discrepancy.

**Materials and Methods:** The sample comprised 150 pretreatment study casts with fully erupted and complete permanent dentitions from first molar to first molar, selected randomly from 1100 consecutively treated white orthodontic patients. Mesiodistal tooth widths were measured using digital calipers, and the Bolton analysis and the tooth size corrections were calculated by the Hamilton Arch Tooth System (HATS) software. Simple visual estimation of Bolton discrepancy was also performed.

**Results:** In the sample group, 17.4% had anterior tooth-width ratios and 5.4% had total arch ratios more than 2 of Bolton's standard deviations from Bolton's mean. For the anterior analysis, a correction greater than ±2 mm was required for 16% of patients if made in the upper arch or 9% if made in the lower arch. For the total arch analysis, the corresponding figures were 28% and 24%.

**Conclusions:** It is recommended that 2 mm of required tooth size correction is an appropriate threshold for clinical significance. A significant percentage of patients have a TSD of this size. Visual estimation of TSD has low sensitivity and specificity. Careful measurement is more frequently required in clinical practice than visual estimation would suggest.

## INTRODUCTION

If a patient has a significant tooth size discrepancy (TSD) between the arches, orthodontic alignment of the teeth into ideal occlusion may not be possible.1 There have been several studies suggesting methods of defining and measuring tooth size discrepancy,2–5 but the best-known study of tooth size disharmony in relation to treatment of malocclusion was by Bolton6 in 1958. Bolton developed two ratios for estimating TSD: the summed mesiodistal widths of the mandibular anterior teeth divided by those of the maxillary anterior teeth (anterior ratio), and the summed widths of all lower teeth divided by those of all upper teeth from first molar to first molar (overall or total-arch ratio).
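The two ratios can be sketched as follows. This is a minimal illustration only: the function name `bolton_ratio` and the tooth widths are hypothetical and are not taken from the study data.

```python
from typing import Sequence


def bolton_ratio(mandibular_widths: Sequence[float],
                 maxillary_widths: Sequence[float]) -> float:
    """Return summed mandibular width / summed maxillary width, as a percentage."""
    upper_sum = sum(maxillary_widths)
    if upper_sum <= 0:
        raise ValueError("maxillary widths must sum to a positive value")
    return 100.0 * sum(mandibular_widths) / upper_sum


# Hypothetical mesiodistal widths in mm, ordered from first molar to
# first molar (12 teeth per arch); indices 3..8 are canine to canine.
upper = [10.0, 6.8, 7.0, 7.8, 6.7, 8.7, 8.7, 6.7, 7.8, 7.0, 6.8, 10.0]
lower = [11.0, 7.2, 7.0, 6.9, 5.9, 5.4, 5.4, 5.9, 6.9, 7.0, 7.2, 11.0]

anterior_ratio = bolton_ratio(lower[3:9], upper[3:9])  # six anterior teeth
overall_ratio = bolton_ratio(lower, upper)             # all twelve teeth

print(f"anterior ratio: {anterior_ratio:.1f}%")
print(f"overall ratio:  {overall_ratio:.1f}%")
```

Both ratios place the mandibular sum in the numerator, so a high ratio indicates relative mandibular tooth size excess.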

The subjects in Bolton's original sample were chosen to have excellent occlusions, so every case had Bolton ratios that, by definition, did not prevent a good occlusion. The use of Bolton's standard deviations in a random sample of orthodontic patients may, therefore, overestimate the incidence of a clinically significant discrepancy in clinical practice. This would explain the high proportion of orthodontic patients in the studies in Table 17–10 with ratios beyond two of Bolton's standard deviations from Bolton's mean. In this table, Araujo and Souki11 also found a high proportion of patients with anterior tooth size discrepancies, but they defined a discrepancy as greater than ±1 standard deviation from Bolton's mean ratio. Bolton's mean ratio is likely to be a good guide to a ratio permitting a good occlusion, but his standard deviations of this ratio may be a poor indicator of a clinically significant TSD.

One way for clinicians to get a better feel for the clinical significance of a discrepancy is to focus more on the actual size of the discrepancy than on the Bolton ratios alone. This is well illustrated by the study by Bernabé et al,9 whose choice of an absolute amount of discrepancy in millimeters (1.5 mm) as an indicator of TSD showed that, in their sample, Bolton's standard deviations of ratio were, surprisingly, a substantial underestimate of the incidence of significant discrepancy by that definition. Approximately 30% of the sample had more than 1.5 mm overall arch discrepancy, which is a much larger percentage than any overall arch prevalence in Table 1. This implies that those with overall TSD by Bolton's definition had an absolute discrepancy significantly greater than 1.5 mm.

Proffit1 suggested that a quick check for anterior tooth size discrepancy can be done by comparing the size of upper and lower lateral incisors. He proposed that unless the upper lateral incisors are larger, a discrepancy almost surely exists. For posterior tooth size discrepancy, he recommends that a quick visual check be done by comparing the size of upper and lower second premolars, which should be of approximately equal size.

The present study set out to answer three questions:

How much tooth size discrepancy, in millimeters, matters clinically?

What percentage of a white orthodontic population has a tooth size discrepancy that matters?

Is simple visual estimation a good method for clinical use?

## MATERIALS AND METHODS

One hundred fifty pretreatment study casts were randomly selected from 1100 consecutively treated patients. Fifty-four patients were male and 96 were female, and the sample included a random selection of malocclusions.

Mesiodistal tooth widths were measured by one of the authors using digital calipers from first molar to first molar at the level of the contact points. The calipers were connected to a computer running the Hamilton Arch Tooth System (HATS) software, which calculates the Bolton analysis and recommends the tooth size correction. For comparison with the measurements, simple visual estimation of TSD was also carried out, in accordance with Proffit's1 suggestions.

### Statistical analyses

All the data were demonstrated to come from a normally distributed population. For the visual estimation, sensitivity and specificity were calculated. Sensitivity is the ability of the test to correctly identify a tooth size discrepancy when one really is present, that is, the proportion of subjects with a discrepancy who are identified as having one (true positives). Specificity is the ability of the test to correctly identify the absence of a tooth size discrepancy when one is indeed not present, that is, the proportion of subjects without a discrepancy who are identified as not having one (true negatives).12
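As a minimal sketch of these two definitions, the following function compares a reference classification with a test classification. The function name and the example data are hypothetical, chosen only to illustrate the calculation.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(truth: Sequence[bool],
                            predicted: Sequence[bool]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) of `predicted` against `truth`.

    truth[i]     -- True if subject i really has a discrepancy (reference method)
    predicted[i] -- True if the test (e.g. visual estimation) reports one
    """
    if len(truth) != len(predicted):
        raise ValueError("truth and predicted must have the same length")
    tp = sum(t and p for t, p in zip(truth, predicted))          # detected
    fn = sum(t and not p for t, p in zip(truth, predicted))      # missed
    tn = sum(not t and not p for t, p in zip(truth, predicted))  # correctly cleared
    fp = sum(not t and p for t, p in zip(truth, predicted))      # false alarms
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one subject with and one without a discrepancy")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical example: 7 subjects with a discrepancy (3 detected),
# 8 subjects without one (6 correctly cleared).
truth = [True] * 7 + [False] * 8
predicted = [True] * 3 + [False] * 4 + [False] * 6 + [True] * 2

sens, spec = sensitivity_specificity(truth, predicted)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
```

Note that both proportions are conditioned on the true status of the subject, not on the test result.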

The paired-sample *t*-test was used to evaluate the systematic error, and there was no statistically significant systematic error. Random error was determined by calculating the standard deviation of the differences of replicate measurements. In addition, the percentage of the total sample variance that consists of error variance (the variance of replicate measurements) was calculated. The standard deviation of replicate measurements was less than 1 mm or 1.5% for all measures; this was small in absolute terms, but a relatively high percentage of the total sample variance. While this level of random error is unlikely to mask significant results in a sample of this size, clinical decisions based on a single measurement should be undertaken with caution, and replicate measurements are advisable.
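The error analysis described above might be sketched as follows. The function name, the example measurements, and in particular the formula used for the error variance (half the variance of the replicate differences) are assumptions made for illustration; the text does not state exactly how the error variance was computed.

```python
import math
import statistics
from typing import NamedTuple, Sequence


class ReplicateError(NamedTuple):
    mean_difference: float     # estimate of systematic error
    sd_of_differences: float   # random error, as described in the text
    paired_t: float            # paired-sample t statistic, df = n - 1
    error_variance_pct: float  # error variance as % of total sample variance


def replicate_error(first: Sequence[float],
                    second: Sequence[float]) -> ReplicateError:
    """Summarize systematic and random error between two measurement sessions."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equally long series with at least 2 values")
    diffs = [a - b for a, b in zip(first, second)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)
    if sd_diff == 0:
        raise ValueError("differences are constant; t statistic is undefined")
    paired_t = mean_diff / (sd_diff / math.sqrt(len(diffs)))
    # Assumption: each session contributes equal, independent error, so the
    # variance of a single measurement's error is var(differences) / 2.
    error_variance = sd_diff ** 2 / 2
    total_variance = statistics.variance(first)
    return ReplicateError(mean_diff, sd_diff, paired_t,
                          100 * error_variance / total_variance)


# Hypothetical replicate measurements of an overall ratio (%) on five casts.
session_1 = [90.0, 92.0, 94.0, 91.0, 93.0]
session_2 = [90.4, 91.8, 94.2, 90.8, 93.3]
result = replicate_error(session_1, session_2)
print(result)
```

The t statistic would then be compared with the critical value of the t distribution for n − 1 degrees of freedom to judge whether the systematic error is statistically significant.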

## RESULTS

Table 2 shows no significant sexual dimorphism for any of the parameters; hence, the sexes were combined for all other analyses.

Tables 3a and 3b compare the sample with Bolton's original sample. The mean ratios for the orthodontic patients of the present study were slightly higher than Bolton's values and had a larger range than those of his sample of excellent occlusions.

Figure 1a shows the distribution of anterior tooth-width ratios in this study, categorized by Bolton's original mean and standard deviations. This format assists comparison with some previous investigations and, in particular, shows the percentage of subjects falling more than 2 standard deviations from Bolton's mean. In this sample, 17.4% had anterior tooth-width ratios more than 2 standard deviations from Bolton's mean (14.7% more than 2 standard deviations above the mean and 2.7% more than 2 standard deviations below it). This shift to the right compared with Bolton's results is also reflected in the higher mean value in this study compared to Bolton's sample, ie, relatively more mandibular tooth width. Figure 1b shows the same data for the total arch ratio, where 5.4% of the sample fell more than 2 standard deviations from Bolton's mean; the distribution was again shifted to the right, favoring high ratios.
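The categorization used in Figures 1a and 1b can be sketched as below. The means and standard deviations are the values commonly quoted from Bolton's 1958 paper (anterior 77.2 ± 1.65, overall 91.3 ± 1.91); they are not stated in this text and should be checked against the original. The function name and the list of ratios are hypothetical.

```python
from typing import Sequence, Tuple

# Commonly quoted values from Bolton (1958): (mean %, standard deviation %).
# These are an assumption here, not taken from the present article.
BOLTON_ANTERIOR = (77.2, 1.65)
BOLTON_OVERALL = (91.3, 1.91)


def share_beyond_two_sd(ratios: Sequence[float],
                        mean: float, sd: float) -> Tuple[float, float]:
    """Return (% of ratios above mean + 2 SD, % of ratios below mean - 2 SD)."""
    if not ratios:
        raise ValueError("ratios must not be empty")
    upper_limit = mean + 2 * sd
    lower_limit = mean - 2 * sd
    high = sum(r > upper_limit for r in ratios)
    low = sum(r < lower_limit for r in ratios)
    return 100 * high / len(ratios), 100 * low / len(ratios)


# Hypothetical anterior ratios (%) for ten patients.
anterior_ratios = [73.0, 77.0, 78.5, 80.6, 81.2, 76.0, 79.9, 82.0, 77.5, 78.0]
pct_high, pct_low = share_beyond_two_sd(anterior_ratios, *BOLTON_ANTERIOR)
print(f"{pct_high:.1f}% above +2 SD, {pct_low:.1f}% below -2 SD")
```

Reporting the two tails separately, as the figures do, makes any asymmetry toward high ratios (mandibular excess) visible.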

Figures 2a and 2b show the percentages of subjects in terms of the upper and lower corrections in millimeters which would be required to give the mean ratio for Bolton's original sample. In these figures, a positive (+) sign on the X axis indicates that the required correction is to increase the tooth structure (relative tooth size deficiency), whereas a negative (−) sign indicates that the required correction is to reduce the tooth structure (relative tooth size excess). For both anterior and total arch correction, the white columns (required upper arch correction) are all taller on the positive side of the graph than the corresponding black columns (required lower arch correction) and vice versa on the negative side. This also indicates relative tooth size excess in the mandibular arch as a consistent feature. For the anterior analysis (Figure 2a), 32% of the sample needed upper correction more than ±1.5 mm and 16% needed upper correction more than ±2 mm, while the corresponding figures for the lower arch were 17% and 9%. For the total arch analysis (Figure 2b), 42% of the sample needed upper correction more than ±1.5 mm and 28% needed upper correction more than ±2 mm, while the corresponding figures for the lower arch were 36% and 24%.
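The required correction in millimeters can be derived from the arch sums and a target ratio. The sketch below is simple algebra under stated assumptions, not a description of how the HATS software works: the function name is hypothetical, the example sums are invented, and the target of 91.3% is the commonly quoted Bolton overall mean, which is not given in this text.

```python
from typing import Tuple


def required_corrections(lower_sum_mm: float, upper_sum_mm: float,
                         target_ratio_pct: float) -> Tuple[float, float]:
    """Return (upper_correction_mm, lower_correction_mm).

    Each value is the change to that arch ALONE that would bring the
    lower/upper ratio to the target. Positive = add tooth structure,
    negative = reduce tooth structure (the sign convention of Figure 2).
    """
    if upper_sum_mm <= 0 or target_ratio_pct <= 0:
        raise ValueError("upper sum and target ratio must be positive")
    target = target_ratio_pct / 100.0
    upper_correction = lower_sum_mm / target - upper_sum_mm
    lower_correction = upper_sum_mm * target - lower_sum_mm
    return upper_correction, lower_correction


# Hypothetical total arch sums (mm) with relative mandibular excess.
upper_mm, lower_mm = required_corrections(lower_sum_mm=87.7,
                                          upper_sum_mm=94.0,
                                          target_ratio_pct=91.3)
print(f"upper arch: {upper_mm:+.2f} mm, lower arch: {lower_mm:+.2f} mm")
```

In this invented case the upper correction exceeds 2 mm while the lower correction does not, which illustrates how the same patient can fall on either side of a millimetric threshold depending on the arch chosen for adjustment.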

The correlation between anterior and total tooth-width ratios (Table 4) was moderate (Pearson's correlation 0.69, *P* < .01). Figure 3 shows the corresponding scatterplot; 48% of the variation in the total ratio can be predicted from the anterior ratio, as determined by *r*² in a regression model.
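The link between the correlation coefficient and the proportion of variation explained can be checked with a few lines. The function names and the example data are hypothetical; only the value 0.69 comes from the text.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation of two equally long series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long series with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant series has no defined correlation")
    return sxy / math.sqrt(sxx * syy)


def variance_explained_pct(r: float) -> float:
    """Percentage of variance in one variable predictable from the other (r squared)."""
    return 100.0 * r * r


# The reported correlation of 0.69 corresponds to about 48% of variance.
print(f"{variance_explained_pct(0.69):.1f}%")
```

In simple linear regression with one predictor, the coefficient of determination equals the square of Pearson's correlation, which is why the two figures in the text are consistent.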

Table 5 summarizes the sensitivity and specificity results of this study. The HATS results were taken as the best estimate of the true (“gold standard”) Bolton ratio, and TSD thresholds of 2 mm and 3 mm were chosen. The results were very similar for both thresholds. For both discrepancies (more than 2 mm and 3 mm), there was low sensitivity (42.9% for 2 mm and 43.8% for 3 mm) and higher specificity (74.1% for 2 mm and 70.9% for 3 mm).

## DISCUSSION

### Bolton's standard deviations of ratio as a measure of significant discrepancy

Many authors7–10 have considered a threshold of 2 standard deviations from Bolton's mean ratio in his original study to be a clinically significant Bolton discrepancy. While this makes statistical sense, we have argued that Bolton's sample was not well suited to identifying what discrepancy would give a significant occlusal imperfection, because every subject was selected for having a good occlusion. In a normally distributed population, approximately 5% of subjects would fall more than 2 standard deviations from the mean. The present study found that 17.4% of the sample had anterior tooth-width ratios more than 2 of Bolton's standard deviations from Bolton's mean (Figure 1a) and that an orthodontic population is skewed in relation to Bolton's mean figure. This result supports others7–10 in Table 1, which have unsurprisingly found that a population of orthodontic patients has a higher percentage of outliers than Bolton's sample by this definition.
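The figure of approximately 5% follows directly from the normal distribution and can be verified with the Python standard library. The function name is hypothetical; the observed percentages are those reported above.

```python
from statistics import NormalDist


def fraction_beyond(k: float) -> float:
    """Two-tailed fraction of a normal population lying more than k SDs from its mean."""
    return 2.0 * (1.0 - NormalDist().cdf(k))


expected_pct = 100.0 * fraction_beyond(2.0)
print(f"expected beyond 2 SD: {expected_pct:.2f}%")

# Percentages observed in this study, for comparison with the expectation.
observed = {"anterior": 17.4, "total arch": 5.4}
for name, pct in observed.items():
    print(f"{name}: observed {pct}% vs expected {expected_pct:.2f}%")
```

The exact two-tailed value is slightly below 5%, so the 5.4% found for the total arch ratio is close to expectation, whereas the anterior figure is several times higher.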

Interestingly, the present sample found almost exactly 5% (5.4%) of the sample had total arch ratios greater than 2 standard deviations from Bolton's mean. Table 1 contains the percentage of significant discrepancies by this definition found in various studies. It is clear that all studies have found a lower percentage of cases falling outside Bolton's standard deviations for the total arch ratio than for the anterior ratio.

An important source of variation in results for these studies may, of course, be variations in the composition and selection of the samples. With regard to gender and race, a systematic literature review13 concluded that the small but statistically significant differences in Bolton ratios sometimes found between different racial groups and genders are of a dimension unlikely to be of a clinically significant size, although there may be significantly higher ratios in Class III patients. The current study found no gender differences, was of an entirely white racial group, and was randomly selected and thus proportionately representative of malocclusion type.

It is relevant to mention the well-known effect of premolar extractions on the ideal Bolton ratios. This effect was recognized and quantified by Bolton14 and much more recently by Kayalioglu et al15 and is the consequence of the effect on a ratio of reducing the absolute sums of the tooth widths in the same way that the ratio is different for the total arch and the smaller anterior arch segment. Because lower second premolars are, on average, slightly larger than upper premolars, a study by Tong et al16 examined the effects on Bolton ratio of hypothetical combinations of premolar extractions in a given case. They found that with combinations of extraction involving lower second premolars, some high overall ratios could become within normal limits after removal of premolars. This effect was modest, but nevertheless raises the question as to which ratio should be considered ideal in a pretreatment malocclusion. This question does not, of course, affect the anterior Bolton ratio.

In the current study, looking at the total arch ratios in pretreatment malocclusions, a complicating factor is that the extraction rate would presently vary widely for the same group of malocclusions when planned by different clinicians with differing treatment philosophies. The decision was taken to report the ratios for these pretreatment malocclusions assuming they were all nonextraction. A further factor in this decision was the appreciation that the TSD is better expressed in millimeters than in terms of discrepancy in ratios. It can be estimated that the prevalence of aberrant total arch ratios would be slightly smaller in this sample if some cases were treated with extraction of upper premolars and lower second premolars.

### Discrepancy in millimeters as a measure of clinical significance

In clinical practice, any correction for TSD may be based on the ratio in percentage terms, but is carried out in absolute millimeters of change in tooth widths. Proffit1 stated that tooth size discrepancies less than 1.5 mm are rarely significant. Taking this level as a significant discrepancy, the current study (Figure 2b) revealed 42% of patients requiring correction for the total arch ratio through upper arch adjustment, or 36% if the lower arch is adjusted. It is worth recalling, as Bernabé et al9 discussed very clearly, that a higher percentage of patients will require correction to a given ratio if the adjustment is in the upper arch. This arises from the larger total tooth width in the upper arch, which therefore requires more millimeters of change to achieve a given percentage change. However, a figure of 1.5 mm is only an occlusal discrepancy of 0.75 mm per side, and this may be considered too small a potential occlusal error to be clinically significant. If 2 mm is taken as an appropriate threshold, 28% (upper arch) or 24% (lower arch) of this population are still in need of occlusal adjustment to permit an ideal occlusion. For the anterior discrepancy data (Figure 2a), the corresponding figures are 32% (upper arch correction) or 17% (lower arch correction) for a 1.5-mm threshold, and 16% or 9% for a 2-mm threshold.

Compared with the Bolton standard deviation definition of TSD, this millimetric way of expressing a threshold of significant discrepancy therefore has the interesting effect that the percentage of cases deemed to have a significant problem is very much higher for the total arch analysis and yet similar for the anterior analysis, particularly if the 2-mm level is chosen and if the alteration is planned for the lower arch. Whereas Table 4 shows that the use of Bolton's standard deviations gives a greater prevalence of anterior discrepancy than total arch discrepancy, the use of a millimetric definition of significant discrepancy produces the reverse relationship, ie, a higher incidence of total arch discrepancy.
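The dependence on the arch chosen for correction can be made explicit with a short derivation. The symbols are introduced here for illustration and do not appear in the original: $U$ and $L$ are the summed upper and lower tooth widths, $R$ is the target ratio expressed as a fraction, and $c_U$ and $c_L$ are the corrections needed if only the upper or only the lower arch is adjusted.

$$
\frac{L}{U + c_U} = R \;\Rightarrow\; c_U = \frac{L}{R} - U,
\qquad
\frac{L + c_L}{U} = R \;\Rightarrow\; c_L = R\,U - L = -R\,c_U .
$$

Hence $|c_L| = R\,|c_U|$, and because $R < 1$ the upper arch correction is always the larger of the two in absolute terms. Assuming the commonly quoted Bolton means of $R \approx 0.913$ (total arch) and $R \approx 0.772$ (anterior), which are not stated in this text, a 2-mm lower correction corresponds to an upper correction of about $2/0.913 \approx 2.19$ mm for the total arch and $2/0.772 \approx 2.59$ mm for the anterior segment. This is consistent with the wider gap between upper and lower arch prevalence seen in the anterior analysis.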

Bernabé et al9 concluded that the prevalence of TSD in any sample will be different based on the method of expressing the discrepancy and the arch chosen for correction and that this could have significant clinical implications. The current study strongly supports this conclusion.

### Visual estimation of TSD

Table 5 reveals that simple “eyeballing” of study models is a poor method of assessing TSD in a representative sample of orthodontic patients. It might have been anticipated that the “eyeball” method would be significantly better at detecting the 3-mm discrepancy subjects, but the eye was equally unreliable at both levels of discrepancy. Simple visual judgment missed more than half of the subjects with a significant discrepancy. In this sample of 150 orthodontic patients, a visual estimation would have missed 11 of the approximately 19 people who had an anterior discrepancy >2 mm and 23 of the approximately 39 people with a total discrepancy >2 mm. As explained above, the percentage of people having a discrepancy >2 mm differs, depending on which arch is chosen for adjustment. Specificity was better, but roughly 26% to 29% of subjects without a significant discrepancy were still judged to have one. It can be concluded that the ability of visual judgment to detect a lack of Bolton discrepancy is higher than the ability to detect a significant Bolton discrepancy, but that this method is highly unreliable.

### Experimental investigation of the threshold value for significance

Perhaps the only experimental work directly addressing this question was an intriguing typodont study by Heusdens et al.17 However, the authors' conclusion that 12 mm of tooth size discrepancy is not of occlusal significance must say more about the insensitivity of some aspects of the weighted Peer Assessment Rating (PAR) score, which was used as the measure of satisfactory occlusion, than about a sensible threshold amount of discrepancy to cause a significant occlusal imperfection. What is sought is a guide to the size of discrepancy that will not permit a good occlusion in spite of the best possible orthodontic correction of tooth alignment and relations. Perhaps a useful approach would be to conduct a similar typodont-based study but to use quantified peer opinion, as opposed to the PAR score, as the arbiter of an unsatisfactory occlusion.

### Thresholds for anterior and total arch discrepancies

An interesting point for debate is whether the same threshold in millimeters should be chosen for the anterior and total arch ratios. A 2-mm discrepancy is a larger percentage of the anterior arch than of the total arch. If each canine errs from a perfect relationship by 1 mm, is this an equivalent or worse occlusal error than 1 mm spread over each side of the whole arch? The question is further complicated by the possibility that the anterior discrepancy could be 2 mm, but the total discrepancy zero.

By definition, Bolton found 5% of subjects to have significant TSD for both anterior and total-arch ratios. All subsequent studies employing the same definition have found much higher percentages of anterior TSD than total-arch TSD. Combining this fact with the higher mean ratios in this study suggests that malocclusions tend to have anterior mandibular tooth excess, but a smaller degree of posterior mandibular excess. The view was taken in this investigation that the same millimetric threshold can be appropriately applied to both anterior and total arch analyses and to total arch analyses of both extraction and nonextraction cases, but this is ultimately a partially philosophical choice. Two millimeters (or 1 mm per side) seems a reasonable minimum for intervention to change the size of teeth on the grounds of occlusal fit. It should be remembered that if an adjustment is required in the anterior segment, then the overall TSD in millimeters will be affected by the same amount, whereas posterior adjustment will leave anterior TSD unchanged. On this basis, this study revealed that 28% or 24% of a representative sample of white patients from the UK have a total arch discrepancy of potential significance and 16% or 9% have a significant anterior arch discrepancy.

## CONCLUSIONS

There was more relative tooth size excess in the mandibular arch in a representative sample of malocclusions compared to Bolton's original sample of excellent occlusions.

Of the sample, 17.4% had anterior ratios and 5.4% had total tooth-width ratios greater than 2 standard deviations from Bolton's mean.

Tooth size discrepancies are better expressed in terms of the millimeters required for correction. A threshold of 2 millimeters is recommended.

In a representative UK orthodontic population, 9% of patients needed anterior lower correction of more than ±2 mm, whereas for the total arch, 24% needed lower correction. For the upper arch correction, the corresponding figures are 16% and 28%. Significant TSD occurs in a significant proportion of patients.

Simple visual inspection is a poor method of detecting TSD. Careful and more frequent measurements are required in clinical practice.

## Acknowledgments

We would like to express our thanks to Rosemary Greenwood for her guidance throughout the statistical analysis process.

## REFERENCES

## Author notes

Corresponding author: Dr Nigel Harradine, Bristol Dental Health, Child Dental Health, Lower Maudlin Street, Bristol, UK