The International Knee Documentation Committee Subjective Knee Evaluation Form (IKDC) is the most frequently used patient-reported measure of subjective knee function among individuals with anterior cruciate ligament reconstruction (ACLR). Yet, due to the limitations of traditional validation approaches, whether the IKDC measures knee function as intended is unclear. Rasch analysis offers a robust validation approach, which may enhance the clinical interpretation of the IKDC.
To assess the psychometric properties, ability to classify health status, and relationships between the IKDC and objective measures of strength and functional performance relative to a newly proposed reduced-item instrument.
A total of 77 individuals with primary unilateral ACLR (age = 21.9 ± 7.8 years, time postsurgery = 6.2 ± 1.0 months) and 76 age-matched control individuals (age = 22.0 ± 4.2 years).
Rasch analysis was used to assess the psychometric properties of the IKDC. Receiver operator characteristic curves and logistic regression were calculated to assess the accuracy of classifying participants with ACLR versus control participants. Pearson product moment and Spearman rank order correlation analyses were conducted to evaluate relationships among subjective knee function, quadriceps torque, and single-limb hop performance.
Rasch analysis aided the development of a reduced 8-item instrument (IKDC-8), which yielded improved psychometric properties in the rating scale performance (IKDC-8 = 0, IKDC = 3 nonmonotonic “misbehaving” items), percentage of variance accounted for by 1 dimension (IKDC-8 = 71.5%, IKDC = 56.7%), and precision in item separation (IKDC-8 = 9.79, IKDC = 5.02). The IKDC was an outstanding diagnostic tool, and the IKDC-8 was excellent, correctly classifying 87.2% and 82.7% of cases, respectively. Using the Hanley-McNeil formula, we found no difference in the areas under the respective receiver operator characteristic curves. Equivalent associations between subjective and objective knee function were observed regardless of the instrument used.
We demonstrated evidence of enhanced reliability and validity for a parsimonious measure of subjective knee function. The proposed instrument reduces the number of items, increases the score interpretability as measuring a single construct, and improves the rating scale functioning while not diminishing its ability to classify participants with ACLR versus control participants or changing existing relationships with objective measures of recovery. We suggest the IKDC-8 may enhance clinical use by reducing administration time, improving the interpretation of the subjective knee function score, and clarifying functional ability.
Use of the Rasch measurement model aided in the development of a reduced-item instrument that measures subjective knee function.
Compared with the International Knee Documentation Committee Subjective Knee Evaluation Form (IKDC), the reduced-item IKDC-8 demonstrated greater evidence of measuring subjective knee function alone and no other constructs, displayed improved function of the response categories for each item, and placed respondents on a reliable continuum of knee function.
With fewer items and clearer response options than in the original form, the IKDC-8 can help clinicians accurately map respondents' progress in their rehabilitation after anterior cruciate ligament reconstruction.
Over the past several decades, patient-reported outcome measures (PROMs) have continued to grow in clinical importance.1 Patient-reported outcome measures provide information on an individual's experience during medical treatment and recovery processes and serve as one of the primary pillars of quality health care.2 The increased attention to patient-oriented outcomes corresponds with more emphasis being placed on patient involvement and satisfaction in health care decision making.3 Alongside the prominent role of objective measures, the authors4 of recent meta-analyses reported the increasing importance of PROMs in return-to-sport (RTS) decisions for individuals with anterior cruciate ligament reconstruction (ACLR). Yet in spite of the greater use of PROMs, practitioners today echo the limitations noted nearly 25 years ago regarding insufficient reliability standards and the role of PROMs in guiding clinical diagnoses and recommendations.5,6
The International Knee Documentation Committee Subjective Knee Evaluation Form (IKDC) is one of the most prominent PROMs used to assess knee function after ACLR.7 In addition to commonly used objective measures of muscle strength and functional performance (ie, hop tests), the IKDC has been included in a number of RTS test batteries.8 Moreover, the relationships between subjective and objective knee function have been widely reported.9,10 For example, better subjective knee function has been associated with several critical functional measures among individuals with ACLR, including greater unilateral and symmetric knee-extensor torque,9 better single-legged hop (SLH) performance,11 quadriceps strength symmetry,12 and overall successful performance on RTS tests with high sensitivity.13 These data have suggested that the IKDC may be a useful clinical indicator of functional recovery after ACLR and may have utility as a screening tool when incorporated in a battery of tests aimed at assessing overall patient health status.9,11–13
Previous investigators14,15 assessing the reliability of the IKDC have relied primarily on classical test theory, an approach with long-noted limitations in establishing a stable psychometric instrument.1 Classical test theory techniques used to develop instruments can typically ensure internal scale consistency (ie, correlation among items) but do not sufficiently address other assumptions of construct validity, including specific objectivity and unidimensionality. For a PROM to be useful as a meaningful outcome in any clinical setting, it is necessary to establish how reliably the responses on the instrument can be summed to produce a valid measure of the intended construct (eg, subjective knee function).16 No authors have fully evaluated the IKDC using the Rasch measurement model (RMM), which is considered the most robust approach for providing defensible evidence of the scientific rigor of PROM instruments.17,18 Specifically, instrument validation via RMM uses probabilistic expectations to estimate the extent to which participant response patterns match the expected development for subjective knee function.
Therefore, the primary purpose of our study was to use the RMM to assess the psychometric properties of the IKDC as a patient-reported measure of knee function. Based on the interpretation of the psychometric diagnostics and content expertise, we proposed a shortened version of the IKDC. To evaluate the construct validity of the resultant IKDC, we examined its performance using several clinical metrics. Accordingly, our secondary purpose was to evaluate the ability of the original and resultant IKDC instruments to classify participants with and those without a history of ACLR by using identified target values of subjective knee function. Our tertiary purpose was to assess the existing relationships between subjective knee function and common objective measures of patient recovery using each instrument.
We performed a retrospective analysis of data collected using a cross-sectional design to investigate the individual IKDC item responses in both individuals with ACLR and a control group of similar age and sex distributions. Eighty individuals with a history of primary unilateral ACLR were recruited from a university sports medicine clinic, university student body, and local community near the time of physician clearance over a 3-year period. Eighty age-matched individuals were enrolled as control participants. Three participants from the ACLR group and 4 participants from the control group did not complete the IKDC and were removed from the analysis, leaving 77 participants with ACLR and 76 control participants (Table 1). To be eligible for the ACLR group, patients must have undergone surgery with either a bone-patellar tendon-bone or hamstrings tendon autograft and could not have a history of failed ACLR or previous lower extremity surgery. Participants in the control group could not have a history of lower extremity surgery or injury within the 6 months before the study. This study was approved by the Institutional Review Board for Health Sciences Research at the University of Virginia, and all participants provided written and oral informed consent.
All participants completed the study procedures in the order described in this section. Outcomes were measured bilaterally, beginning with the uninvolved limb (ACLR group) or control limb (control group) in a counterbalanced order. The nondominant limb of the control group participants was identified as the limb not used to kick a ball and matched with the involved limb of the ACLR group participants.
Knee-extension maximal voluntary isometric contraction torque was measured using a stationary dynamometer (model Systems 3; Biodex Medical Systems Inc) with the knee flexed to 90° as previously described.19 After a brief period of familiarization, participants were instructed to kick out as hard as possible. Visual feedback of the torque output was provided, and oral encouragement by the investigator was used to ensure that each person gave maximal effort. The average of 3 maximal voluntary isometric contraction trials was recorded and normalized to body mass (Nm/kg). Limb symmetry indices were expressed as a percentage of the average involved limb performance divided by the average uninvolved limb performance.
Single-Legged Hop Performance
The SLH for distance was measured as described earlier.20 Participants performed 3 practice trials followed by 3 test trials. The average of the 3 test trials was recorded, normalized to body height, and expressed as a unitless ratio. The limb symmetry index was expressed as a percentage of average involved limb performance divided by the average uninvolved limb performance.
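The normalization and limb symmetry calculations described for torque and hop distance can be sketched as follows. This is a minimal illustration, not the authors' processing code; the function names and the trial values are hypothetical.

```python
from statistics import mean

def normalized_mean(trials, divisor):
    """Average the trials, then normalize (body mass for torque, height for hop)."""
    return mean(trials) / divisor

def limb_symmetry_index(involved_trials, uninvolved_trials):
    """LSI (%) = mean involved-limb performance / mean uninvolved-limb performance x 100."""
    return 100.0 * mean(involved_trials) / mean(uninvolved_trials)

# Hypothetical torque trials (Nm) for one participant with a body mass of 70 kg.
involved = [150.0, 155.0, 160.0]
uninvolved = [190.0, 200.0, 210.0]

torque_involved = normalized_mean(involved, 70.0)   # Nm/kg
lsi = limb_symmetry_index(involved, uninvolved)     # percent
```

The same two functions apply to hop distance, with body height as the divisor.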
Subjective Knee Function
The IKDC14 (copyright 2000, American Orthopaedic Society for Sports Medicine), consisting of 19 knee-specific questions related to symptoms, sport activities, and function, was used to quantify subjective knee function. The traditional method was used to calculate scores on a scale from 0 to 100. Higher scores indicated fewer symptoms and higher overall function. Researchers14 have reported an internal consistency coefficient of 0.92 among a sample of patients with a variety of knee injuries and disorders (24% reported ACL injury).
We conducted Rasch analysis of the IKDC data via the Rating Scale Model21 using WINSTEPS (version 4.5.2; Linacre). An initial analysis of the IKDC responses of the 153 participants established baseline values for key measurement diagnostics, as follows: (1) rating scale performance, (2) unidimensionality, and (3) person and item fit indices with their corresponding reliability statistics. Then, based on the interpretation of the diagnostics, changes were made iteratively to the IKDC.22
Rating scale performance was assessed by examining the item response categories to determine if they functioned as intended. For the IKDC items, this implied that participants with higher levels of knee function would endorse higher response categories in a linear and consistent manner across the items. In addition, each response option should have represented a distinct qualitative and quantitative meaning.
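To make the expectation about response categories concrete, the sketch below computes category probabilities under the Andrich Rating Scale Model for a single item. The 4-category item, the threshold values, and the person measures are hypothetical and were chosen only to show that the expected response rises with the person measure; they are not estimates from this study.

```python
import math

def rsm_category_probabilities(theta, delta, thresholds):
    """Andrich Rating Scale Model probabilities for categories 0..len(thresholds).

    theta: person measure (logits); delta: item difficulty (logits);
    thresholds: Andrich thresholds tau_1..tau_M shared by all items.
    """
    # Cumulative log-numerators; category 0 is fixed at 0 by convention.
    log_num = [0.0]
    for tau in thresholds:
        log_num.append(log_num[-1] + (theta - delta - tau))
    peak = max(log_num)  # subtract the maximum for numerical stability
    weights = [math.exp(v - peak) for v in log_num]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical 4-category item (scores 0-3) with thresholds in logits.
taus = [-1.5, 0.0, 1.5]
low = rsm_category_probabilities(theta=-2.0, delta=0.0, thresholds=taus)
high = rsm_category_probabilities(theta=2.0, delta=0.0, thresholds=taus)

# Expected item score for each hypothetical person.
expected_low = sum(k * p for k, p in enumerate(low))
expected_high = sum(k * p for k, p in enumerate(high))
```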
We performed unidimensionality tests to determine whether the measure of knee function was confounded by items that represent >1 construct. Rasch diagnostics indicated the extent to which the variance in participant responses to IKDC items could be explained solely by the knee function of the participant and the difficulty of the items. Greater amounts of variance explained by the principal component indicated better adherence to the principle of unidimensionality, with values of >60% typically required.23
Person and item fit indices were explored to examine how well individuals met the expectation of the RMM. To distinguish among levels of subjective knee function, person separation was expected to be >2.0 with reliability of >0.8, and item separation was expected to be >3.0 with reliability of >0.9. Residual mean square fit statistics were generated to determine the extent to which the IKDC items fit a unidimensional linear measure of subjective knee function, with an expected value of 1.0 for each item and those with >2.0 interpreted as a possible threat to the measurement system.22
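The separation and reliability criteria above are linked by a standard Rasch relationship, reliability = G²/(1 + G²), where G is the separation index. The sketch below also includes the commonly used strata formula, (4G + 1)/3, as one way to express separation as a number of statistically distinct levels. The function names are ours, and the strata formula is offered as an assumption about how distinct levels are typically counted rather than a statement of the authors' method.

```python
import math

def reliability_from_separation(g):
    """Reliability implied by a separation index G."""
    return g ** 2 / (1 + g ** 2)

def separation_from_reliability(r):
    """Separation index implied by a reliability coefficient (0 <= r < 1)."""
    return math.sqrt(r / (1 - r))

def strata(g):
    """Number of statistically distinct levels implied by separation G."""
    return (4 * g + 1) / 3

person_criterion = reliability_from_separation(2.0)  # person separation threshold
item_criterion = reliability_from_separation(3.0)    # item separation threshold
levels_at_criterion = strata(2.0)
```

Under these formulas, a person separation of 2.0 corresponds to a reliability of 0.8 and 3 strata, and an item separation of 3.0 corresponds to a reliability of 0.9, which matches the paired criteria stated above.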
Classification of Participants With ACLR Versus Control Participants
Receiver operator characteristic (ROC) curves were used to assess the diagnostic utility of the IKDC and a reduced-item IKDC. The positive actual state was identified as a history of ACLR with the implicit understanding that lower scores on the respective IKDC instruments indicated greater impairments in subjective knee function. The area under the ROC curve (AUC) was used to determine the ability of each instrument to correctly discriminate participants with ACLR versus control participants and was interpreted as no (<0.5), acceptable (0.7–0.8), excellent (0.8–0.9), or outstanding (>0.90) discrimination.24 Each ROC curve was visually inspected to identify a target value that maximized the sensitivity (true positives/[true positives + false-negatives]) and specificity (true negatives/[true negatives + false-positives]) of the instrument using the Youden Index. The sensitivity, specificity, and positive and negative likelihood ratios were calculated for each identified target value. The Hanley-McNeil formula was applied to determine if the AUCs differed among the instruments.25 Binary logistic regression analyses were conducted to assess the ability of the selected target values to accurately classify participants with ACLR versus control participants. We used χ2 analyses to compare the proportion of control individuals and those with ACLR meeting or not meeting established target values between instruments.
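The target-value selection and AUC comparison can be illustrated with a short sketch. It assumes that a positive test is a score below the cutoff, uses the Youden Index J = sensitivity + specificity − 1, and applies the Hanley-McNeil standard error and z statistic for 2 AUCs derived from the same cases. All scores, AUC values, and the correlation r between the AUC estimates are hypothetical; in the Hanley-McNeil method, r is read from a published lookup table, which is not reproduced here.

```python
import math

def sens_spec(acl_scores, control_scores, cutoff):
    """Positive test = score below the cutoff (lower score = worse function)."""
    tp = sum(s < cutoff for s in acl_scores)
    tn = sum(s >= cutoff for s in control_scores)
    return tp / len(acl_scores), tn / len(control_scores)

def best_cutoff_by_youden(acl_scores, control_scores):
    """Return (cutoff, J) for the cutoff maximizing J = sensitivity + specificity - 1."""
    best = None
    for c in sorted(set(acl_scores) | set(control_scores)):
        se, sp = sens_spec(acl_scores, control_scores, c)
        j = se + sp - 1
        if best is None or j > best[1]:
            best = (c, j)
    return best

def hanley_mcneil_se(auc, n_pos, n_neg):
    """Standard error of an AUC (Hanley-McNeil approximation)."""
    q1 = auc / (2 - auc)
    q2 = 2 * auc ** 2 / (1 + auc)
    var = (auc * (1 - auc) + (n_pos - 1) * (q1 - auc ** 2)
           + (n_neg - 1) * (q2 - auc ** 2)) / (n_pos * n_neg)
    return math.sqrt(var)

def hanley_mcneil_z(auc1, auc2, n_pos, n_neg, r):
    """z statistic for 2 correlated AUCs; r is the correlation between them."""
    se1 = hanley_mcneil_se(auc1, n_pos, n_neg)
    se2 = hanley_mcneil_se(auc2, n_pos, n_neg)
    return (auc1 - auc2) / math.sqrt(se1 ** 2 + se2 ** 2 - 2 * r * se1 * se2)

# Hypothetical 0-100 scores for 6 ACLR and 6 control participants.
acl = [60, 70, 80, 85, 90, 92]
control = [88, 95, 97, 99, 100, 100]
best = best_cutoff_by_youden(acl, control)

# Hypothetical AUCs and correlation; group sizes follow the study (77 and 76).
z = hanley_mcneil_z(0.92, 0.88, n_pos=77, n_neg=76, r=0.8)
```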
Relationships Between Subjective and Objective Function
Bivariate correlation coefficients were generated to assess the associations among subjective knee function, knee-extensor torque, and SLH performance in the ACLR group only using each IKDC instrument. The Pearson product moment correlation (r) was applied for normally distributed outcomes, and the Spearman rank order correlation (ρ) was applied for nonnormally distributed outcomes as determined with the Shapiro-Wilk test. Coefficients were classified as negligible (0–0.29), low (0.3–0.49), moderate (0.5–0.69), high (0.7–0.89), or very high (0.9–1).26 The correlations between the functional performance measures and the respective IKDC instruments were tested for any statistical differences using the Fisher r-to-z transformation and Steiger formula.27 The level of statistical significance for all analyses was set a priori at a P value of ≤.05 and 1 − β = 0.80. All analyses except for the Rasch analysis were performed using SPSS (version 25; IBM Corp).
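The classification bands and the comparison of 2 dependent correlations that share a variable (eg, hop distance correlated with each IKDC instrument) can be sketched as below. The test statistic follows the Steiger modification of the Fisher r-to-z comparison for overlapping correlations. The correlation between the 2 instruments (`r_kh`) and the sample size used in the example are assumptions for illustration, so the result should not be read as a reproduction of the reported P values.

```python
import math

def classify_r(r):
    """Label the magnitude of a correlation using the bands stated above."""
    a = abs(r)
    if a < 0.3:
        return "negligible"
    if a < 0.5:
        return "low"
    if a < 0.7:
        return "moderate"
    if a < 0.9:
        return "high"
    return "very high"

def steiger_z(r_jk, r_jh, r_kh, n):
    """Compare dependent correlations r_jk and r_jh that share variable j.

    r_kh is the correlation between the 2 non-shared variables.
    Returns (z, two-sided P value).
    """
    r_bar = (r_jk + r_jh) / 2
    psi = (r_kh * (1 - 2 * r_bar ** 2)
           - 0.5 * r_bar ** 2 * (1 - 2 * r_bar ** 2 - r_kh ** 2))
    s = psi / (1 - r_bar ** 2) ** 2
    # Fisher r-to-z transformation is atanh(r).
    z = (math.atanh(r_jk) - math.atanh(r_jh)) * math.sqrt(n - 3) / math.sqrt(2 - 2 * s)
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p

# Reported correlations with hop distance; r_kh = 0.90 and n = 77 are assumed.
z_stat, p_value = steiger_z(r_jk=0.403, r_jh=0.442, r_kh=0.90, n=77)
```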
Group characteristics are presented in Table 1. The ACLR group was taller (P = .040) and demonstrated lower knee-extension torque, knee-extension torque symmetry, SLH distance, SLH symmetry, and IKDC scores (all P values < .001) than the control group. Individuals in the ACLR group were enrolled at a mean of 6.2 months after surgery.
The RMM was used to evaluate the IKDC (Table 2). First, the rating scales were changed to address categories with insufficient selection (ie, <10 observations of that category across all items) and categories that advanced in a nonmonotonic fashion. Next, items with a mean square statistic of >2 were removed from the instrument (IKDC items 8, 9c, and 9e). Based on this iterative process, 10 of the 18 items used to score the IKDC were removed. This resulted in an 8-item version of the IKDC (IKDC-8; Appendix Figures 1 and 2) with a raw score range of 0 to 24.
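The first screening step can be illustrated with a small check that flags response categories with fewer than 10 observations and categories whose average person measure fails to advance with the category score. This is one common reading of "nonmonotonic" and is an assumption on our part; the counts and average measures below are hypothetical and do not come from the study data.

```python
def flag_categories(counts, average_measures, min_count=10):
    """Return (sparse, disordered) category indices for one rating scale.

    counts[k]: observations of category k across all items.
    average_measures[k]: mean person measure (logits) of respondents choosing k.
    """
    sparse = [k for k, c in enumerate(counts) if c < min_count]
    disordered = [k for k in range(1, len(average_measures))
                  if average_measures[k] <= average_measures[k - 1]]
    return sparse, disordered

# Hypothetical 11-point (0-10) scale.
counts = [3, 5, 12, 14, 9, 20, 25, 31, 40, 52, 95]
avg = [-2.1, -1.8, -1.2, -1.3, -0.6, -0.1, 0.4, 0.3, 1.1, 1.9, 3.0]
sparse, disordered = flag_categories(counts, avg)
```

Categories flagged in either list would be candidates for combining with an adjacent category before the model is re-estimated.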
Rasch analysis of the IKDC-8 revealed several domains of psychometric superiority and several domains of noninferiority compared with the IKDC (Table 3). The IKDC-8 rating scale displayed a better distribution of observations across rating scale categories for the 3 items (IKDC items 2, 3, and 10b) that used an 11-point scale in the original IKDC. The mean square statistics for item infit and outfit fell within the accepted range for high-stakes tests of 0.8 to 1.2 across all 8 items.23 The analysis for unidimensionality in the IKDC-8 instrument accounted for 71.5% of the variance in participant responses to the IKDC-8 items, with no evidence that the items were eliciting systematic information beyond that associated with subjective knee function. The Figure provides a visual representation of the unidimensional construct of subjective knee function measured using the IKDC-8. The person separation index for both the IKDC and IKDC-8 indicated that each instrument could separate participants into at least 3 distinct levels of knee function. The person reliability was moderately high (0.83), implying that the items on the IKDC-8 would produce reliable measures of subjective knee function among another group of similar participants. The order of item difficulty was reproducible, with an extremely high reliability of 0.99.
Classification of Participants With ACLR Versus Control Participants
Relationships Between Subjective and Objective Function
Regardless of the instrument used, a higher rating of subjective knee function demonstrated a negligible association with greater knee-extension torque (IKDC, r = 0.251; IKDC-8, r = 0.265; P = .90) and low associations with greater SLH distance (IKDC, r = 0.403; IKDC-8, r = 0.442; P = .69) and SLH symmetry (IKDC, ρ = 0.332; IKDC-8, ρ = 0.355; P = .83) in the ACLR group. Neither instrument was associated with knee-extension torque symmetry (IKDC, r = −0.120; IKDC-8, r = −0.072; P = .68). The magnitude of association did not differ between instruments.
Our primary aim was to evaluate the IKDC by using the RMM to assess the suitability of the instrument for measuring the construct of subjective knee function. An iterative approach led to the development of the IKDC-8, a reduced-item version of the IKDC that demonstrated superiority in rating scale performance, unidimensionality, and person and item fit indices. The observed improvements provided evidence of increased construct validity by means of enhanced instrument performance. The Rasch analysis of the IKDC-8 provided additional information to help clinicians evaluate how well patient reports of knee function fit the underlying theoretical construct (as shown in the Figure). Despite including fewer items, the IKDC-8 explained more variance in how participants responded to individual items than the IKDC. We observed ceiling effects, with 37% (n = 57/153; 3/57 ACLR) of the sample achieving a maximal measure (score = 24). Including the control individuals was necessary to establish a meaningful “healthy” threshold and did not change the ordering of the items or fundamentally alter the interpretation of the construct of knee function. Collectively, the IKDC scores in our sample of individuals with ACLR were consistent with published normative data for those with and those without a history of knee injury (82.1 ± 11.7 versus 82 ± 22, respectively).28 Despite the referenced data including a heterogeneous sample (based on injury history), our findings aligned with previous reports29,30 of IKDC scores among individuals 6 months after primary unilateral ACLR, suggesting that our sample was representative of this population.
The Rasch analysis clarified the meaning of the response categories and flagged items in the IKDC that did not fit within the theoretical model of subjective knee function. In developing the IKDC-8, we removed items from the IKDC that did not fit in a linear, progressive measure of knee function. Several items did not fit the unidimensional construct of knee function. The 2 symptom items (IKDC 4 and 6) were identified as a secondary dimension and, furthermore, did not have sufficient observations in all rating categories. The 4 activity-level items (IKDC 1, 5, 7, and 8) also formed an additional dimension, in line with findings from an earlier study.31 The previous knee function item (IKDC 10a), which is not used in calculating the typical IKDC score, was removed here as well because the construct of interest pertains to current knee function. Three of the “daily activity items” were significant misfits with the model (mean square fit statistics >2). By removing misfitting items and addressing concerns in how participants used the rating scales, we noted a substantial increase (approximately 15 percentage points, from 56.7% to 71.5%) in the variance in participants' responses explained by a single dimension. In addition to the examination of fit statistics, combining ambiguous categories reduced the measurement error associated with the confusion of adjacent, unlabeled categories (eg, the difference in meaning between a score of 6 versus 7).
An additional benefit of using the RMM to develop the IKDC-8 was the placement of items on an equal-intervals scale, which allowed a clear comparison of relative item difficulty. This feature enabled easier identification of individuals who provided highly unexpected answers that deviated from the established model of subjective knee function. On the Figure, although the 2 participants displayed on the right achieved the same IKDC-8 score (raw score of 19 converted to traditional 0–100 scale = 79.2), their response patterns revealed important differences. The responses from the first participant (ACLR no. 2) conformed almost perfectly with expectation (eg, a person with an IKDC-8 score of 79.2 would have no difficulty with uniplanar activities; minimal difficulty with multiplanar activities; and occasional, mild pain). The second participant (ACLR no. 35) was a curious case in which the individual expressed no difficulty in uniplanar and multiplanar activities and no limitation in daily activities and yet had constant moderate to severe pain. The ability to identify how well responses from individuals with ACLR in rehabilitation contexts fit the expected model of subjective knee function offers clinicians and researchers greater diagnostic clarity and clinical guidance because deviation from expectation is a case in which intervention is most important. Aberrant response patterns could indicate that the participant misunderstood an item or that the unexpected response reflects a deficiency in his or her functional status or rehabilitation program.
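The raw-score conversion mentioned above appears to be a simple proportion of the maximum raw score of 24. The sketch below rests on that assumption, which reproduces the reported value of 79.2 for a raw score of 19; the function name is hypothetical.

```python
def to_percent_scale(raw, max_raw=24):
    """Convert an IKDC-8 raw score (0 to max_raw) to the 0-100 scale."""
    if not 0 <= raw <= max_raw:
        raise ValueError("raw score out of range")
    return round(100.0 * raw / max_raw, 1)

score = to_percent_scale(19)
```

Because 2 respondents with the same converted score can have different response patterns, the converted score is best read alongside the item-level responses, as the 2 cases above illustrate.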
We also observed similar findings relative to the ability of the IKDC-8 to accurately classify ACLR versus control participants. For example, the IKDC-8 demonstrated an excellent ability (AUC = 0.8–0.9) to discriminate between individuals with ACLR and control participants, whereas the IKDC demonstrated an outstanding ability to do so (AUC, >0.90). However, the CIs of the associated AUC values for each instrument overlapped considerably and were not statistically different, suggesting comparable diagnostic utility. The identified target values for each instrument were highly sensitive, yet the IKDC-8 displayed less specificity than the IKDC (0.71 versus 0.83). Accordingly, the IKDC-8 yielded a stronger negative likelihood ratio but a smaller positive likelihood ratio than the IKDC. From a practical standpoint, these data indicate that scoring ≥93.8 on the IKDC-8 resulted in a 3-fold increase in the odds of being classified as a control participant, whereas scoring <93.8 yielded a 14-fold increase in the odds of being classified as having ACLR. In contrast, a target value of 94.9 on the IKDC yielded a 5-fold increase and 9-fold decrease in the same odds, respectively. As expected, most control individuals met the identified target value of each instrument (IKDC = 83%, IKDC-8 = 71%), whereas most individuals with ACLR did not meet it (IKDC = 91%, IKDC-8 = 95%). Although the passing and failing proportions in each group did not differ statistically, the IKDC-8 instrument appeared to be more difficult for control individuals and those with ACLR to pass on average, which could contribute to a more conservative assessment of perceived knee function if used in clinical practice. The identified target values appear similar to those in an investigation by Zwolski et al,12 who reported that achieving an IKDC score of ≥94.8 was a sensitive indicator of quadriceps strength symmetry at the time of RTS after ACLR. 
Lentz et al32 determined that achieving an IKDC score of ≥93 indicated a higher likelihood of readiness to RTS after ACLR. These values are similar to the mean IKDC scores of our control group (IKDC = 97.7, IKDC-8 = 95.0), which suggests that our identified target values were appropriate indicators of control status.
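The likelihood ratios discussed above follow directly from sensitivity and specificity. The sketch below uses the rounded values reported in the text, taking sensitivity as the proportion of participants with ACLR who did not meet the target value (IKDC = 0.91, IKDC-8 = 0.95) and specificity as the proportion of control participants who met it (IKDC = 0.83, IKDC-8 = 0.71). Because the inputs are rounded, the results approximate the study values rather than reproduce them exactly.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a dichotomous test."""
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    return lr_pos, lr_neg

ikdc_pos, ikdc_neg = likelihood_ratios(0.91, 0.83)    # about 5.4 and 0.11
ikdc8_pos, ikdc8_neg = likelihood_ratios(0.95, 0.71)  # about 3.3 and 0.07

# Reciprocal of LR- expresses the negative result as a fold change in odds.
ikdc_fold = 1 / ikdc_neg     # about 9
ikdc8_fold = 1 / ikdc8_neg   # about 14
```

These values are consistent with the 3-fold and 14-fold changes described for the IKDC-8 and the 5-fold and 9-fold changes described for the IKDC.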
Better subjective knee function was associated with greater quadriceps strength, SLH distance, and SLH symmetry among individuals with ACLR, which agrees with previous reports.11,33 However, we did not observe an association between subjective knee function and quadriceps strength symmetry, although such an association has been described previously.12,34 Although relationships between subjective and objective knee function can be influenced by the magnitude of impairment, our individuals with ACLR appeared to demonstrate impairments in quadriceps strength, quadriceps strength symmetry, SLH distance, and SLH symmetry consistent with those in the literature.35 The lower self-reported current activity level in the ACLR group could partially explain the objectively measured impairments, even though these relationships were unclear in the literature.10 Nevertheless, the observed relationships were of negligible to low magnitude, suggesting that each instrument of knee function may provide unique information about individuals recovering from ACLR that is distinct from objective measures of quadriceps and lower extremity function. This appears to indicate that perceived function differs from measured function and supports the utility of each in the RTS decision-making process. According to the observed relationships, the IKDC-8 seems to supply similar information to the IKDC, which offers early support for its clinical utility as an indicator of perceived functional recovery after ACLR.
Our initial findings have important implications for measuring subjective knee function during the early rehabilitation process as an indicator of recovery. Stakeholders from clinicians and researchers to patients and insurance providers continue to invest in and seek improvements in the quality of care when it comes to rehabilitation outcomes.1 Patient-reported outcome measures will continue to play an important part in evaluating the success of rehabilitation treatments. Implementing an instrument, such as the IKDC-8, that was developed using the RMM may lead to greater precision and clarity for measuring the subjective knee function of individuals with ACLR. Item-level analysis can provide stakeholders with more clarity about what patients are experiencing and guide clinical decision making. Continually referring to how patient responses fit the established continuum of knee function will facilitate better informed clinical care.
Our findings have the potential to improve clinical assessments, but 2 limitations need to be considered. First, the IKDC-8 responses were derived from the original IKDC instrument. Responses from an existing IKDC data set were analyzed and modified under the guidelines of the RMM. These results should be interpreted cautiously until the IKDC-8 can be administered and analyzed in an independent sample. As with any instrument, greater precision will come with using the proposed IKDC-8 among a broader range of participants and a variety of populations extending beyond individuals with primary, unilateral ACLR. Allowing participants to use the newly proposed rating categories could lead to future refinement or change. Although all the steps undertaken in the development of the IKDC-8 followed well-established guidelines,22,36 future investigators who use the IKDC-8 in the proposed format will provide greater information on the reliability and clinical utility of the IKDC-8. The second limitation pertains to the cross-sectional nature of studying the same cohort for the exploratory ROC and correlational analyses. The statistical methods we used to compare the results produced by the respective instruments took into account the commonalities in the standard errors introduced by the same sample. Even with that consideration, researchers examining separate cohorts can work toward a more confirmatory analysis.
Application of the RMM led to the proposal of a reduced-item instrument of subjective knee function, namely the IKDC-8. The IKDC-8 exhibited superior rating scale performance (greater clarity in meaning of response categories and elimination of items with confounding meanings), dimensionality (better understanding of knee function with fewer items and a single construct of knee function), and item fit (increased precision in item difficulty), as well as noninferior reliability statistics and the ability to differentiate participants relative to the IKDC. The reduction in the number of response items of the IKDC-8 may minimize administration time, improve the interpretation of the resulting score by permitting analysis of response patterns, and enhance scale functioning by clarifying response categories. These changes reflected in the IKDC-8 did not appear to diminish the instrument's ability to classify participants with ACLR versus control participants and did not change existing relationships with objective measures of recovery. Implementation of the IKDC-8 may be useful to further clarify patient recovery along a continuum of lower to higher levels of knee function after ACLR.
We thank the American Orthopaedic Society for Sports Medicine for permitting the use of the International Knee Documentation Committee Subjective Knee Evaluation Form.