Background

There are more than ten classification systems currently used in the staging of hallux rigidus. This results in confusion and inconsistency with radiographic interpretation and treatment. The reliability of hallux rigidus classification systems has not yet been tested. We sought to evaluate the intraobserver and interobserver reliabilities of three commonly used classifications for hallux rigidus.

Methods

Twenty-one plain radiograph sets were presented to ten American College of Foot and Ankle Surgeons board-certified foot and ankle surgeons. Each physician classified each radiograph based on clinical experience and knowledge according to the Regnauld, Roukis, and Hattrup and Johnson classification systems. The two-way mixed single-measure consistency intraclass correlation coefficient was used to calculate intrarater and interrater reliabilities.

Results

The mean ± SD intrarater reliability of individual sets for the Roukis (0.62 ± 0.19) and Hattrup and Johnson (0.62 ± 0.28) classification systems was fair to good and for the Regnauld system bordered between fair to good and poor (0.43 ± 0.24). The interrater reliability of the mean classification was excellent for all three classification systems.

Conclusions

Reliable and reproducible classification systems are essential for guiding treatment and estimating prognosis in hallux rigidus. Herein, the Roukis classification system had the best intrarater reliability. Although there are various classification systems for hallux rigidus, the present results indicate that the three classification systems evaluated show reliability and reproducibility.

Hallux limitus and hallux rigidus are two of the most common pathologic disorders facing podiatric physicians, affecting 2.5% of the adult population.1  Hallux limitus is an arthritic condition causing decreased motion at the first metatarsophalangeal joint; it tends to be more frequent in men, and 44% of affected patients are older than 80 years. The decrease in sagittal plane motion leads to pain, stiffness, and gait disturbances.2,3  The condition is progressive and often results in hallux rigidus, a complete lack of motion at the first metatarsophalangeal joint. Although hallux limitus does not seem to have a single underlying cause, it is thought to result from hypermobility of the first ray, pes planus, a long first metatarsal, metatarsus primus elevatus, iatrogenic complications, inflammatory diseases such as gout and rheumatoid arthritis, and genetics.2,4,5 

Hallux rigidus and hallux limitus can be diagnosed with radiographs and clinical evaluation.2,6,7  The degree of joint space narrowing and the presence of osteophytes and first metatarsal elevation help determine the severity of the disorder. Numerous radiographic classification systems have been proposed to measure and grade hallux rigidus.8-14  The three most commonly used classifications are the Regnauld, Hattrup and Johnson, and Roukis classification systems. The Regnauld classification has three grades, with grade I defined as functional hallux limitus, grade II as joint adaptation with flattening of the first metatarsal head and pain at end range of motion, and grade III as arthrosis with severe flattening of the first metatarsal head, osteophytes, asymmetrical joint space narrowing, and erosions.8,11  The Hattrup and Johnson classification also has three grades: grade I is characterized as mild-to-moderate formation of osteophytes with no joint space involvement; grade II as moderate osteophyte formation, joint space narrowing, and subchondral sclerosis; and grade III as increased osteophyte formation and loss of joint space.9  The Roukis classification was the first grading system applied prospectively; it is similar to the Regnauld classification but includes a stage IV, which is defined as having less than 10° of range of motion and loose bodies with obliteration of joint space in the first metatarsophalangeal joint.10 

Surgeons rely heavily on radiographs and classification systems to confirm clinical diagnosis and aid in surgical planning.6  The objective of this study was to assess the interobserver and intraobserver reliability of the Regnauld, Hattrup and Johnson, and Roukis classification systems of hallux rigidus. In particular, we examine the consistency of three commonly used classification systems and discuss implications for the treatment of hallux rigidus.

Materials and Methods

We selected the Regnauld, Roukis, and Hattrup and Johnson hallux rigidus classification systems for study (Table 1). Twenty-one plain radiograph sets (three radiograph packets with each packet containing seven sets of radiographs) of the foot were randomly selected from physicians' electronic systems based on the presence of hallux limitus deformity. Three standard views were selected and provided per set (weightbearing anteroposterior, medial oblique, and lateral) (Fig. 1). Each radiograph packet contained the same seven sets of radiographs with one of the three classification systems attached.

Table 1

Comparison of the Three Hallux Rigidus Classification Systems Used in the Study10-12 

Figure 1

Examples of radiographs used in this study: weightbearing anteroposterior (A) and lateral/medial oblique (B) views of the left foot.

Ten American College of Foot and Ankle Surgeons (ACFAS) board-certified foot and ankle surgeons were instructed to objectively classify each packet of radiographs according to its assigned classification system based on their clinical experience and knowledge. The surgeons were randomly selected. A handout of the instructions was given to each physician. This process was repeated with each individual physician three times to determine whether each physician classified each set of radiographs consistently and to eliminate potential bias. All of the surgeons were familiar with the Regnauld, Roukis, and Hattrup and Johnson classification systems (Table 1).

Statistics

Physician intrarater and interrater reliabilities were assessed based on the two-way mixed single-measure consistency intraclass correlation coefficient (ICC).14,15  The ICC provides a measure of the proportion of reliable variance and typically ranges between 0 and 1.15  Specifically, ICC[3,1] (reliability of a single rating) and ICC[3,k] (reliability of the mean of k ratings) are reported. The ICC was interpreted based on guidelines for reliability provided by Fleiss16: less than 0.40, poor reliability; 0.40 to 0.75, fair to good reliability; and greater than 0.75, excellent reliability. Analyses were conducted using R version 3.1.3 (R Foundation for Statistical Computing, Vienna, Austria).12-14 
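As an illustration of the statistics described above, the following minimal Python sketch computes ICC[3,1] and ICC[3,k] from the mean squares of a two-way layout (rows are rated subjects, columns are raters or repeated readings) and applies the Fleiss cutoffs. It is not the R code used in the study. The function names `icc_consistency` and `fleiss_label` and the ratings table `EXAMPLE` are hypothetical and introduced only for illustration; the table does not contain study data.

```python
from typing import List, Tuple


def icc_consistency(ratings: List[List[float]]) -> Tuple[float, float]:
    """Return (ICC[3,1], ICC[3,k]) for an n-subjects x k-ratings table.

    Two-way mixed, consistency definition: rater (column) effects are
    removed and do not count against reliability.
    """
    n = len(ratings)       # number of rated subjects (rows)
    k = len(ratings[0])    # number of raters or repeated readings (columns)
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_error = ss_total - ss_rows - ss_cols

    bms = ss_rows / (n - 1)                 # between-subjects mean square
    ems = ss_error / ((n - 1) * (k - 1))    # residual mean square

    icc_single = (bms - ems) / (bms + (k - 1) * ems)   # ICC[3,1]
    icc_average = (bms - ems) / bms                    # ICC[3,k]
    return icc_single, icc_average


def fleiss_label(icc: float) -> str:
    """Map an ICC to the reliability categories quoted in the text."""
    if icc < 0.40:
        return "poor"
    if icc <= 0.75:
        return "fair to good"
    return "excellent"


# Hypothetical ratings: 6 subjects (rows) scored by 4 raters (columns).
EXAMPLE = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]

if __name__ == "__main__":
    single, average = icc_consistency(EXAMPLE)
    print(f"ICC[3,1] = {single:.2f} ({fleiss_label(single)})")
    print(f"ICC[3,k] = {average:.2f} ({fleiss_label(average)})")
```

For this illustrative table the sketch gives roughly 0.71 for ICC[3,1] and 0.91 for ICC[3,k], which would be labeled fair to good and excellent, respectively.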

Results

The 21 radiograph sets were assessed by the ten ACFAS board-certified foot and ankle surgeons: nine men and one woman with 6 to 7 years of postgraduate education and 5 to more than 20 years of experience. Overall, the mean ± SD intrarater reliability of individual sets for the Roukis and Hattrup and Johnson classification systems was fair to good (0.62 ± 0.19 and 0.62 ± 0.28, respectively), whereas that for the Regnauld system bordered between fair to good and poor (0.43 ± 0.24). The mean ± SD intrarater reliability of the mean classification across the three sets was excellent for the Roukis and Hattrup and Johnson classification systems (0.81 ± 0.13 and 0.78 ± 0.23, respectively) and fair to good for the Regnauld system (0.65 ± 0.24). Intrarater and interrater reliabilities are shown in Table 2.
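Under the consistency definition, ICC[3,k] is related to ICC[3,1] by the Spearman-Brown formula, which helps explain why the reliability of the mean classification is higher than that of individual readings. The sketch below is illustrative only: the function name `spearman_brown` is hypothetical, and k = 3 is an assumption based on the three repeated readings described in the Methods.

```python
def spearman_brown(icc_single: float, k: int) -> float:
    """Step a single-measure consistency ICC up to the mean of k ratings."""
    return k * icc_single / (1 + (k - 1) * icc_single)


# Mean single-measure intrarater ICCs reported in the Results.
reported_single = {
    "Roukis": 0.62,
    "Hattrup and Johnson": 0.62,
    "Regnauld": 0.43,
}
K_REPEATS = 3  # assumption: three repeated readings per physician

stepped_up = {
    name: spearman_brown(value, K_REPEATS)
    for name, value in reported_single.items()
}

if __name__ == "__main__":
    for name, value in stepped_up.items():
        print(f"{name}: {value:.2f}")
```

Applied to the reported means, the formula gives about 0.83, 0.83, and 0.69. These are close to, but not the same as, the reported 0.81, 0.78, and 0.65, presumably because the reported values average each physician's own coefficient rather than transforming the overall mean.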

Table 2

Physician Intrarater and Interrater Reliabilities for the Three Studied Classification Systems for Hallux Rigidus

Interrater reliability was lower for set 3 than for sets 1 and 2 for all classification systems (Table 2). The Roukis classification system attained consistently higher interrater reliability than the Regnauld or Hattrup and Johnson classification systems across sets 1, 2, and 3. Specifically, interrater reliability of individual physicians for the Roukis classification system was fair to good, whereas that for the Regnauld and Hattrup and Johnson classification systems was fair to good for sets 1 and 2 and poor for set 3. The possible outliers in set 3 could be attributed to subjective reading, variability in radiographic views, and the different professional backgrounds and experience of the physicians. Interrater reliability of the mean classification across the ten physicians was excellent for all three classification systems, with the highest reliability attained by the Roukis classification system.

Discussion

We evaluated the intrarater and interrater reliability of three commonly used hallux rigidus classification systems in evaluating foot radiographs in a sample of ten ACFAS board-certified surgeons. In doing so, we evaluated whether particular classification systems may exhibit greater consistency for the grading of hallux rigidus. The present results show consistency and little variability in the Regnauld, Roukis, and Hattrup and Johnson classification systems.

Hallux rigidus of the first metatarsophalangeal joint is one of the most common conditions associated with the foot. A multitude of classification systems for hallux rigidus have been used since 1930.17  Previous research has discussed the relative faults and strengths of the various classification systems, including reliance solely on radiographic findings without accounting for clinical findings.17  To our knowledge, only one other study, by Pate et al,18  has compared the reliability and intraobserver agreement of these classification systems. Those authors suggested using radiographic grading systems for hallux rigidus with caution because they found only 75% intraobserver reliability. In the present study, we found that the Roukis and Hattrup and Johnson classification systems obtained higher intrarater reliability than the Regnauld system. With respect to interrater reliability, the Roukis classification system attained higher values than the Regnauld and Hattrup and Johnson classification systems.

The classification systems evaluated herein are useful for grading the severity of hallux rigidus and for directing treatment or evaluating prognosis. For example, a Roukis grade I would lead to either conservative treatment or a cheilectomy, Watermann, or Youngswick-Austin procedure, whereas a Roukis grade IV would point toward a choice between an implant and arthrodesis.12,19  Conservative treatments for hallux rigidus include icing, nonsteroidal anti-inflammatory drugs, shoe modification, corticosteroid injections, physical therapy, and orthotic devices.18  The goal of surgical treatment is to relieve pain and ensure mobility of the first metatarsophalangeal joint. The present study offers the first quantitative comparison of the relative strengths of the classification systems based on interrater and intrarater reliability. This information can be useful in determining the treatment plan that an individual needs based on the level of severity.

Limitations of this study include the relatively small number of observers and radiographs used. Future research using larger sample sizes may be useful to validate the conclusions identified herein. In addition, the present study focused on the Roukis, Regnauld, and Hattrup and Johnson classification systems. These particular systems were chosen because of their widespread use. However, other classification systems, such as those of Hanft, Coughlin, and Drago, Oloff, and Jacobs, may also merit evaluation.20,21 

The present data show consistent findings of reliability and reproducibility using the three classification systems, with the Roukis classification having the best intrarater reliability. Currently, there is no gold standard among hallux rigidus classification systems, which makes it difficult for physicians to compare results obtained with different classification systems.6,21  We conclude that the three classification systems evaluated are reliable and reproducible tools for evaluating radiographs and grading hallux rigidus. Larger randomized prospective studies may be useful to evaluate the conclusions observed herein.

Financial Disclosure: None reported.

Conflict of Interest: None reported.

References

1. Keiserman LS, Sammarco VJ, Sammarco GJ: Surgical treatment of hallux rigidus. Foot Ankle Clin 10: 75, 2005.
2. Coughlin MJ, Shurnas PS: Hallux rigidus: demographics, etiology, and radiographic assessment. Foot Ankle Int 24: 731, 2003.
3. Botek G, Anderson MA: Etiology, pathophysiology and staging of hallux rigidus. Clin Podiatr Med Surg 28: 229, 2011.
4. Yee G, Lau J: Current concepts review: hallux rigidus. Foot Ankle Int 29: 392, 2008.
5. Chang TJ: Stepwise approach to hallux limitus: a surgical perspective. Clin Podiatr Med Surg 13: 449, 1996.
6. Zgonis T, Jolly GP, Garbalosa JC: The value of radiographic parameters in the surgical treatment of hallux rigidus. J Foot Ankle Surg 44: 184, 2005.
7. Shereff MJ, Baumhauer JF: Hallux rigidus and osteoarthritis of first metatarsophalangeal joint. J Bone Joint Surg Am 80: 898, 1998.
8. Vanore JV, Christensen JC, Kravitz SR, et al: Diagnosis and treatment of first metatarsophalangeal joint disorders, section 1: hallux rigidus. J Foot Ankle Surg 42: 112, 2003.
9. Drago JJ, Oloff L, Jacobs AM: A comprehensive review of hallux limitus. J Foot Surg 23: 213, 1984.
10. Hattrup SJ, Johnson KA: Subjective results of hallux rigidus following treatment with cheilectomy. Clin Orthop Relat Res 226: 182, 1988.
11. Regnauld B: The Foot: Pathology, Aetiology, Seminology, Clinical Investigation and Therapy, Springer-Verlag, New York, 1986.
12. Roukis TS, Jacobs M, Dawson DM, et al: A prospective comparison of clinical, radiographic, and intraoperative features of hallux rigidus. J Foot Ankle Surg 41: 76, 2002.
13. Coughlin MJ, Shurnas PS: Hallux rigidus: grading and long-term results of operative treatment. J Bone Joint Surg Am 85: 2072, 2003.
14. Norman GR, Streiner DL: Biostatistics: The Bare Essentials, BC Decker, Hamilton, ON, Canada, 2008.
15. Shrout PE, Fleiss JL: Intraclass correlations: uses in assessing rater reliability. Psychol Bull 86: 420, 1979.
16. Fleiss J: The Design and Analysis of Clinical Experiments, John Wiley & Sons Inc, New York, 1986.
17. Beeson P, Phillips C, Coor S: Classification system for hallux rigidus: a review of the literature. Foot Ankle Int 29: 407, 2008.
18. Pate RC, Fanning JW, Shields NN, et al: Reliability of hallux rigidus radiographic grading system. Kansas J Med 8: 125, 2015.
19. Taranow WS, Moore J: Hallux rigidus: a treatment algorithm. Tech Foot Ankle 11: 65, 2012.
20. Shurnas PS: Hallux rigidus: etiology, biomechanics and nonoperative treatment. Foot Ankle Clin 14: 1, 2009.
21. Hanft JR, Mason ET, Landsman AS, et al: A new radiographic classification for hallux limitus. J Foot Ankle Surg 32: 397, 1993.

Author notes

*

Department of Podiatry, West Houston Medical Center, Houston, TX.

Baylor College of Medicine, Houston, TX.