The Patient-Determined Disease Steps (PDDS) scale is a patient-reported measure of disability used by at least 3 North American multiple sclerosis (MS) registries. We conducted a systematic review of the psychometric properties of the PDDS scale as part of a harmonization effort related to disability measures used in MS registries.
We searched the EMBASE, Ovid Medline, Scopus, Cochrane Database of Systematic Reviews, CENTRAL, CINAHL Plus, and ClinicalTrials.gov databases from database inception through July 28, 2020. Two reviewers independently screened abstracts and full-text reports for study inclusion and data extraction and assessed study quality and risk of bias. We included studies that assessed the validity or reliability of the PDDS scale. We conducted a meta-analysis to quantitatively summarize the findings.
From the 2476 abstracts screened, 234 articles underwent full-text review, of which 5 met the inclusion criteria. These studies assessed criterion validity, construct validity, and test-retest reliability. In all studies, criterion validity was assessed by correlating the PDDS scale score with the Expanded Disability Status Scale score (pooled r = 0.73; 95% CI, 0.66–0.79). Test-retest reliability was high (pooled intraclass correlation coefficient = 0.96; 95% CI, 0.92–0.99).
In this systematic review, the PDDS scale demonstrated criterion and construct validity for assessing disability in individuals with MS who have mild to moderate disabilities. This review also supports the test-retest reliability of the PDDS scale, although further studies with larger samples are needed.
Multiple registries and observational studies for multiple sclerosis (MS) in North America1 have captured information about disability and disease progression. These data have the potential to be leveraged further by collaboration between these efforts,2 but the studies have different objectives, different methods of data collection (patient-reported vs clinician-reported vs performance-based), and different measures of the same underlying construct, limiting the ability to combine or compare findings across studies. Retrospective data harmonization can improve comparability of similar measures collected across different studies. An initial step to harmonization is to determine what measures are being used and their performance characteristics.3
The Patient-Determined Disease Steps (PDDS) scale is a patient-reported measure of disability used by at least 3 North American MS registries. The PDDS scale began as Disease Steps, a clinician-assessed tool initially developed to address challenges with application of the Expanded Disability Status Scale (EDSS) score in clinical practice (eg, poor interrater reliability).4 From there, it became a patient-reported outcome (PRO) measure and has now been used in more than 100 studies. The PDDS scale includes a single item with 9 levels ranging from 0 (normal) to 8 (bedridden).
We conducted a systematic review of the psychometric properties of the PDDS scale as part of a harmonization effort related to disability measures used by MS registries.
We conducted this review according to an a priori published protocol (PROSPERO #CRD42020211745) and report the findings according to the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) criteria.5 The primary review question was: What are the validity and reliability of disability measures used by North American MS registries participating in the initial phase of the Multiple Sclerosis Metadata Collective (MSMDC, https://www.maelstrom-research.org/network/msmdc)? The participating registries that have listed the measures they use include the Canadian Pediatric Demyelinating Disease Network (CPDDN), iConquerMS People-Powered Research Network (iConquerMS), Johns Hopkins Precision Medicine Center of Excellence for Multiple Sclerosis (JHMS), the Veterans Affairs Multiple Sclerosis Surveillance Registry (MSSR), North American Research Committee on Multiple Sclerosis Registry (NARCOMS), North American Registry for Care and Research in Multiple Sclerosis (NARCRMS), and Pediatric Multiple Sclerosis and other Demyelinating Diseases Database (PeMSDD). The measures identified were the EDSS or a modified version of the EDSS (NARCRMS, PeMSDD, CPDDN), the Multiple Sclerosis Functional Composite or the related MS Performance Test (NARCRMS, JHMS), and the PDDS scale (NARCOMS, iConquerMS, JHMS). The EDSS has been the subject of previous reviews, as has the Multiple Sclerosis Functional Composite6; therefore, we present findings related to the PDDS scale.
We defined the psychometric or measurement properties of the PRO measure of interest (ie, the PDDS scale) according to the following 6 Consensus-based Standards for the Selection of Health Measurement Instruments (COSMIN) consensus criteria.7 (1) Content validity refers to the degree to which the content of the PRO adequately reflects the construct being measured. (2) Criterion validity refers to the degree to which the scores on the PRO adequately reflect a gold standard. (3) Construct validity refers to the degree to which the scores on the PRO are consistent with hypothesized associations with other measures or differences between groups. (4) Internal consistency reliability refers to the degree of interrelatedness among the items; this did not apply to the PDDS scale because it is a single-item measure. (5) Responsiveness refers to the degree to which the PRO can detect changes over time in the construct being measured. (6) Test-retest reliability refers to the extent to which scores remain the same on repeated administration for respondents who have not changed.
Inclusion and Exclusion Criteria
We included original studies of adults with MS (any disease course) that assessed psychometric properties of disability measures using a noninterventional study design.
A medical librarian (L.Y.) searched the literature for records that included the concepts of MS, disability scales, and validity with modified versions of the Terwee et al filter.8 We did not limit the search to disability measures in North American registries at this stage to ensure a comprehensive search and because work to identify which measures were used by North American registries was concurrent. The librarian created search strategies using a combination of keywords and controlled vocabulary in Embase, Ovid Medline, Scopus, the Cochrane Database of Systematic Reviews, the Cochrane Central Register of Controlled Trials, the Cumulative Index to Nursing and Allied Health Literature Plus, and ClinicalTrials.gov. The full search strategies are shown in APPENDIX S1, which is available online at IJMSC.org. All search strategies were completed on July 28, 2020, with no added database-supplied filters or limits. De-duplication of identified citations was performed using EndNote.9
We used a 3-step process for study selection and data extraction. Using EPPI-Reviewer 4 software,10 2 reviewers (R.A.M. and C.M.) independently screened the titles and available abstracts of search results to determine whether they were validation studies of disability measures for individuals with MS. The full text of all reports classified as “include” or “unclear” by either reviewer was retrieved for formal review. In the second step, the 2 reviewers independently assessed the full text of each report by using a standardized form in EPPI-Reviewer 4 that outlined the predetermined inclusion and exclusion criteria. Disagreements were resolved by discussion between the 2 reviewers or by third-party adjudication (A.S.), as needed. At this point we had identified the disability measures used by North American MS registries and thus limited the studies selected for full-text review to those with the aim of validating the PDDS scale.
Data Extraction and Management
We developed a data collection tool and implemented it in EPPI-Reviewer 4; data abstraction was completed by 1 reviewer (C.M.) and verified by a second reviewer (R.A.M.). Information extracted included country, setting (eg, specialty clinic), participant inclusion and exclusion criteria, sample size, type of validity or reliability being assessed, the criterion standard (if applicable), and performance of the measure being assessed (eg, correlations, internal consistency reliability, test-retest reliability).
Risk of Bias (Quality Assessment)
The Quality Assessment of Clinical Measures Research Evaluation Report was used to assess the quality of the studies included. It consists of 12 items that assess (1) the adequacy of the literature review used to establish the research question; (2) explicit inclusion and exclusion criteria; (3) the specific hypotheses or research questions; (4) the appropriate scope of measurement properties considered; (5) the appropriate sample size; (6) the follow-up; (7) how the authors described the measures used and how they were measured; (8) the use of standardized measurement techniques; (9) whether findings are presented for each hypothesis proposed; (10) whether appropriate statistical testing was conducted to obtain point estimates; (11) whether appropriate analyses were completed to estimate precision of the estimates; and (12) whether clear, specific conclusions and recommendations were made.11,12 Each item is scored as 0, 1, or 2. We summed the scores for each item, divided by the maximum possible score, and multiplied by 100%; this constituted the article’s overall quality score.12 We classified the quality as poor (0%–30%), fair (31%–50%), good (51%–70%), very good (71%–90%), and excellent (>90%).12
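The scoring arithmetic described above can be sketched in a few lines. The function below is an illustration only; the function name and boundary handling are our own, not part of the published instrument, which defines the cutoffs as poor (0%–30%), fair (31%–50%), good (51%–70%), very good (71%–90%), and excellent (>90%).

```python
def quality_rating(item_scores):
    """Overall quality score from 12 items, each scored 0, 1, or 2.

    Sums the item scores, divides by the maximum possible score (24),
    multiplies by 100%, and maps the percentage to a quality label.
    Illustrative sketch; not part of the published instrument.
    """
    assert len(item_scores) == 12 and all(s in (0, 1, 2) for s in item_scores)
    percent = sum(item_scores) / (2 * len(item_scores)) * 100
    if percent <= 30:
        label = "poor"
    elif percent <= 50:
        label = "fair"
    elif percent <= 70:
        label = "good"
    elif percent <= 90:
        label = "very good"
    else:
        label = "excellent"
    return percent, label
```

Because item sums are integers out of 24, the percentages fall on multiples of roughly 4.17%, so the published cutoffs never land between two attainable scores.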
We also used the COSMIN Risk of Bias Checklist to assess the risk of bias.13 This checklist provides a separate assessment of bias for each component of validity (eg, criterion, construct) and reliability (eg, test-retest). The overall quality of a component of a study is based on the lowest rating of any standard related to that component and may be classified as inadequate, doubtful, adequate, or very good.
We summarized the findings of all included studies using descriptive statistics. Measures of validity (eg, correlation coefficients) were retrieved from each paper when available and pooled using random-effects meta-analysis after the correlations were converted to z values using the Fisher transformation; the pooled estimates were then converted back to correlation coefficients. We classified correlations of 0.3 or less as weak, 0.4 to 0.6 as moderate, and 0.7 to 1.0 as strong. Heterogeneity was quantified using the I2 statistic,14 and its statistical significance was determined from the P value of the accompanying Cochran Q test.
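The pooling procedure can be illustrated with a minimal sketch. The review itself used Stata; the Python function below is our own illustration (function name and any example values are hypothetical, not the review's data) of Fisher z conversion, DerSimonian-Laird random-effects pooling, and the I2 heterogeneity statistic.

```python
import math

def pool_correlations(studies):
    """Random-effects (DerSimonian-Laird) pooling of correlations via Fisher z.

    `studies` is a list of (r, n) tuples. Returns the pooled correlation,
    its 95% CI, and the I2 statistic (%). Illustrative sketch only.
    """
    # Fisher z transformation; the variance of z is 1 / (n - 3)
    z = [math.atanh(r) for r, n in studies]
    v = [1.0 / (n - 3) for r, n in studies]
    w = [1.0 / vi for vi in v]

    # Fixed-effect estimate and Cochran Q for heterogeneity
    z_fixed = sum(wi * zi for wi, zi in zip(w, z)) / sum(w)
    q = sum(wi * (zi - z_fixed) ** 2 for wi, zi in zip(w, z))
    df = len(studies) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    # DerSimonian-Laird between-study variance tau^2
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights, pooled estimate, and back-transformation to r
    w_re = [1.0 / (vi + tau2) for vi in v]
    z_re = sum(wi * zi for wi, zi in zip(w_re, z)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    ci = (math.tanh(z_re - 1.96 * se), math.tanh(z_re + 1.96 * se))
    return math.tanh(z_re), ci, i2
```

The Fisher transformation is used because the sampling distribution of a raw correlation is skewed near the bounds of its range; pooling on the z scale and back-transforming with tanh avoids that bias.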
Statistical analyses were conducted using Stata Statistical Software: Release 14 (StataCorp LP).
Results of Search
The search identified 4177 citations, of which 1701 were duplicates. Of the remaining 2476 citations, 2242 were excluded at the title/abstract review stage (FIGURE S1). Of 234 articles selected for full-text review, 4 could not be retrieved, and another 92 were excluded. Of the 138 remaining articles, 5 evaluated the psychometric properties of the PDDS scale. Of note, 7 other studies sought to validate another measure and reported correlations between those measures and the PDDS scale.15–21
Description of Studies
Details of the included studies are shown in TABLE S1. Dates of publication ranged from 2013 to 2019. Study sample sizes ranged from 63 to 103; in total, the studies included 458 participants. Across all studies, most participants were women (n = 332, 72.4%). The median level of disability, based on the PDDS scale, was generally in the mild to moderate range.
Four of the studies evaluated the performance of the PDDS scale after translation to other languages (Portuguese, Turkish, Italian, Spanish); none of these studies formally assessed cross-cultural validity by testing the measurement invariance of the PDDS scale. One study evaluated a telephone-administered version of the PDDS scale rather than a self-administered version.
All 5 studies assessed criterion validity using the EDSS as the criterion standard (TABLE S2). Correlations between the PDDS scale and EDSS scores ranged from r = 0.61 to r = 0.79.
With respect to construct validity, the studies chose varying comparators (Table S1) and sometimes differed in how they classified a given comparator (convergent vs divergent; eg, age). Generally, the expected associations with related constructs, such as mobility, were observed. None of the identified studies assessed known-groups (discriminative) validity.
Three studies reported test-retest reliability for the conventional PDDS,22–24 all with intraclass correlation coefficients exceeding 0.90. None of the studies assessed responsiveness. One study reported internal consistency reliability, but it is unclear how this was obtained because this cannot be calculated for scales with a single item, such as the PDDS scale.
On meta-analysis of the correlations for criterion validity between the PDDS scale and the EDSS, the summary correlation was 0.73 (95% CI, 0.66–0.79) (TABLE 1).
With respect to convergent construct validity, the summary estimates showed moderate to strong correlations between the PDDS scale and other measures that assess mobility, including the Timed 25-Foot Walk test, the 6-Minute Walk Test, and the Timed Up and Go test (Table 1). For divergent construct validity, 3 studies reported the correlation between education and PDDS scale scores; the summary estimate showed a weak correlation, as expected. The summary estimate for test-retest reliability was 0.96 (95% CI, 0.92–0.99).
Quality and Risk of Bias
Study quality ranged from good (n = 2) to very good (n = 3) (TABLE 2). Study quality was most often reduced by lack of appropriate sample size justification or a relatively narrow focus in terms of psychometric evaluation (ie, evaluating only 1 aspect of validity and reliability). The risk of bias for criterion validity was low in most studies, and it was low in all studies for construct validity (TABLE S3). Risk of bias was higher in studies that assessed test-retest reliability due to small sample sizes and the lack of convincing evidence that study participants were, in fact, stable over time.
In this systematic review, we identified 5 studies that provided information regarding the psychometric performance of the PDDS scale for assessing disability in people with MS. Based on these studies, the PDDS scale may be considered a useful measure for capturing patient-reported disability. The PDDS scale is strongly correlated with the clinician-scored EDSS (pooled r = 0.73). In addition to criterion validity, we found evidence of convergent and divergent construct validity. The PDDS scale correlates moderately with other measures of mobility, including the Timed 25-Foot Walk test, the 6-Minute Walk Test, and the Timed Up and Go test, but it is not associated with education. Although the evidence is more limited because it is based on 3 studies with small samples, the PDDS scale also appears to have excellent test-retest reliability.
When selecting a measure for use in clinical or research settings, there are considerations beyond psychometric properties. Similar to the EDSS, the PDDS scale focuses on mobility. Although it is moderately correlated with the EDSS, it does not capture the effects of MS on other important domains, such as cognition, vision, upper limb function, pain, and fatigue. Other measures are needed if the goal is to capture the full breadth of the effects of MS. Nonetheless, the PDDS scale has several potential advantages. It is freely available, and it can be completed very quickly by people living with MS, limiting response burden. Due to its brevity, the PDDS scale may be particularly useful in large epidemiologic studies or studies with frequent reevaluations where clinician-assessed or performance-based measures of disability are not feasible. Recent cataloguing efforts among MS registries build a foundation for future collaborations suited to studying outcomes such as disability over longer time frames. When retrospective harmonization is feasible, these collaborations will provide increased sample sizes and a greater ability to examine specific subpopulations than can be achieved within a single study. These investigations require reliable and valid tools to ensure robust characterization over time. However, in a recent illustration of retrospective harmonization among 3 registries, disability was identified as a particular challenge due to the number of different instruments used.2 Understanding the properties of each measure is a critical first step toward harmonization. Of the registries in the MSMDC, 3 used the PDDS scale.
This review was guided by a standardized protocol. The quality evaluations indicated that the studies reviewed were of good or very good quality. We excluded 7 studies that were designed to assess the validity of another MS disability measure. These studies could have provided additional support for the validity of the PDDS scale but would have been difficult to evaluate in terms of study quality and risk of bias given that they did not meet the intended inclusion criteria.
The present findings highlight areas for improvement. The PDDS scale was assessed in samples that predominantly included women with mild to moderate disability as assessed by the PDDS scale. The mildest PDDS scale rating is 0 (normal); this rating does not distinguish between someone with no symptoms and a normal neurologic examination (EDSS score = 0) and someone with no symptoms but abnormalities on examination (EDSS score = 1). This potentially reduces concordance with the EDSS at the lower end of the scale and introduces measurement error. The most severe PDDS scale rating is 8 (bedridden). In samples in which participants have severe disabilities, this may impose a ceiling effect and limit the ability to detect change; further assessment in individuals with severe disabilities is needed. Further evaluation of the criterion validity of the PDDS scale in samples with a broader range of disability, including examination of variations in validity across the scale range, would also be useful.

Adequate sample size justifications were lacking, and samples were particularly small for test-retest reliability; this should be addressed in future studies. Future studies should also provide more comprehensive assessments of validity and reliability. None of the studies assessed the responsiveness of the PDDS scale to change or its ability to discriminate between subgroups (discriminative validity). The studies that translated the PDDS scale did not formally assess measurement invariance, that is, whether the original and translated questionnaires measure the same constructs; this also merits further study.

In addition to assessing quality (eg, the presence of methodological safeguards), we also assessed the risk of bias. Overall, the risk of bias was low for criterion and construct validity based on the COSMIN Risk of Bias Checklist.
This systematic review and meta-analysis supports the criterion validity and construct validity of the PDDS scale for assessing disability in samples of people with MS with mild to moderate disabilities. Further investigation is needed to establish the PDDS scale’s general performance and responsiveness in people with MS who have severe disabilities.
The Patient-Determined Disease Steps (PDDS) scale is a brief, patient-reported measure of disability in multiple sclerosis.
This review supports the criterion validity and construct validity of the PDDS scale to assess disability in individuals with multiple sclerosis and mild to moderate disabilities.
The PDDS scale may be useful in settings where clinician-assessed or performance-based disability measures are not feasible.
FUNDING/SUPPORT: This study was funded by the National Multiple Sclerosis Society (SI-1907-34399). Dr Marrie is supported by the Waugh Family Chair in Multiple Sclerosis and a research chair from Research Manitoba. Dr Salter is supported by a Biostatistics/Informatics Junior Faculty Award from the National Multiple Sclerosis Society.
DISCLOSURES: Dr Marrie receives research funding from Canadian Institutes of Health Research, the MS Society of Canada, the MS Scientific Research Foundation, the National MS Society, Crohn’s and Colitis Canada, the US Department of Defense, the Arthritis Society, and the Consortium of Multiple Sclerosis Centers (CMSC). Ms Yaeger serves as a statistical editor for Circulation: Cardiovascular Imaging. Ms McFadyen and Dr Salter declare no conflicts of interest.
Note: Supplementary material for this article is available at IJMSC.org.