Single-item athlete self-report measures consist of a single question to assess a dimension of wellbeing. These methods are recommended and frequently used for athlete monitoring, yet their uniformity has not been well assessed, and we have a limited understanding of their relationship with measures of training load.
Our objective was to investigate the applications and designs of single-item self-report measures used in monitoring team-sport athletes and to present the relationship between these measures and measures of training load.
PubMed, Scopus, and SPORTDiscus were searched between inception and March 2019.
Articles were included if they concerned adult athletes from field- or court-sport domains, if athlete wellbeing was measured using a single-item self-report, and if the relationship with a measure of modifiable training load was investigated over at least 7 days.
Data related to participant characteristics, self-report measures, training load measures, and statistical analysis and outcomes were extracted by 2 authors (C.D. and C.D.).
A total of 21 studies were included in the analysis. A narrative synthesis was conducted. The measures used most frequently were muscle soreness, fatigue, sleep quality, stress, and mood. All measures presented various relationships with metrics of training load from no association to a very large association, and the associations were predominantly trivial to moderate in the studies with the largest numbers of observations. Relationships were largely negative associations.
The implications of this review should be considered by users in the application and clinical utility of single-item self-report measures in athlete monitoring. Great emphasis has been placed on examining the relationship between subjective and objective measures of training load. Although the relationship is still unclear, such an association may not be expected or useful. Researchers should consider the measurement properties of single-item self-report measures and seek to establish their relationship with clinically meaningful outcomes. As such, further study is required to inform practitioners on the appropriate objective application of data from single-item self-report measures.
A variety of approaches are used to apply and analyze single-item self-report measures in team-sport athlete monitoring.
Composite and single-item wellness measures presented various relationships with measures of training load, ranging from no association to a very large association.
Although self-report measures have established value for users in communication facilitation and information disclosure, further research is required to establish their objective clinical utility.
In future work, investigators should use evidence-based considerations to develop (ie, measurement properties) and analyze self-report measures in order to encourage robust and uniform research methods that can inform clinical practice.
In the era of data gathering in the athlete-monitoring continuum, researchers and practitioners in sports science and medicine find themselves in pursuit of meaningful signposts to facilitate athlete progress through optimal workload and recovery while reducing injuries and assembling minute components of competitive advantage. The recommendations for such signposts have focused on an array of objective and subjective measures of training load and recovery.1 Surveying monitoring trends in high-performance sport, Taylor et al2 found that most practitioners (70%) placed equal focus on load quantification and the monitoring of fatigue and recovery; 84% used self-report questionnaires.
Although multi-item self-report measures with published validity and reliability have been considered responsive to training-induced changes in athletes' wellbeing,3 sports programs tend to favor brief, custom self-report measures in practice due to their ease of use, sport specificity, and automation capacity.2,4,5 In a review, Saw et al3 identified subscales that may be of use in monitoring immediate responses to training load and concluded that previous recommendations6–8 for daily monitoring with self-administered measures remained appropriate owing to the need for immediate, daily adjustments to training. However, the relationship between these measures and athlete workload remains unclear.
Athlete self-report measures (ASRMs) in practice often comprise brief, single-item checklists derived from validated questionnaires, symptoms of overtraining, or sport-specific outcomes that are intended to be completed daily. Single item in this context refers to the single-question measurement of an aspect of wellbeing, such as rating general fatigue on a Likert scale, as opposed to a multi-item measure in which several questions may be used to quantify fatigue. Yet selectively combining scales or items from multiple empirical measures negates their established psychometric properties.9 In addition, many of the custom measures used in practice were not based on empirically derived, valid, or reliable scientific evidence.9 Further challenges with the utility of the data obtained using this method include questionnaire fatigue, data accuracy, and practitioner data burden,10,11 particularly concerning the establishment of meaningful change and actionable signposts or “red flags.”9,12
A key challenge for practitioners designing and implementing these measures is the lack of understanding regarding the clinical utility of single-item self-report measures of athlete wellbeing. Clinical utility in sport science and medicine may be described as the relevance or usefulness of an intervention or process.13 Evidence has suggested that ASRMs in practice are used predominantly as status indicators of athlete readiness and facilitators of communication,10,14 but whether and how well these measures respond to training load or reflect recovery or readiness remains unclear. For instance, whether the relationship between athletes' self-assessed wellbeing using single-item measures and workload is strong or weak, linear or nonlinear, negative or positive, or even exists at all is unknown. This may be due, in part, to the heterogeneity of the measures themselves, how they are applied, and the fact that they are frequently developed commercially or designed in-house.
Custom single-item self-report measures are widely used in elite sport2 to measure athlete readiness and the training-load response. As such, exploring their relationship with measures of training load is warranted to inform the design and implementation of future self-report measures in sport. Clarity is needed on the specifications of the measures in use and whether a relationship exists between single-item self-report wellness measures and training load. Because the types of training, applications of monitoring, and athletes' motivations differ between team and individual sports, it is important to consider these studies in isolation.5,10 Therefore, the purposes of our review were to (1) investigate the application and design of single-item self-report measures used to monitor team-sport athletes and (2) examine the association between these measures and measures of training load.
Search Strategy and Study Selection
The design and reporting of this review were conducted in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA).15 This review was preregistered with PROSPERO (CRD42019138105). Criteria for study eligibility can be found in Table 1. The literature search was conducted using the PubMed, Scopus, and SPORTDiscus databases between inception and March 1, 2019. The search strategy for this review is presented in Table 2 and had no applied restrictions. A total of 18 361 articles were found and imported into Covidence systematic review software (Veritas Health Innovation, Melbourne, Australia) for screening. Abstracts were screened for eligibility by 2 authors (C.D. and C.D.), whereas full-text screening was completed by the first author (C. Duignan). A total of 21 studies were included in the review (Figure 1).
Data Extraction and Synthesis
Data were extracted by 2 authors (C.D. and C.D.) and entered in a custom-designed spreadsheet. Data related to participant characteristics (Table 3), self-report measures (Table 4), and statistical analysis and outcomes (see Supplemental Table 1, available online at http://dx.doi.org/10.4085/1062-6050-2020-20.S1) were recorded. A meta-analysis was deemed inappropriate because of the heterogeneity of both the self-report and training load measures used in the included studies; therefore, a narrative synthesis of the literature was conducted.
Assessment of Study Quality
The methodologic quality of the studies was assessed independently by 2 reviewers (C.D. and C.D.) using a modified Downs and Black checklist.37 The original Downs and Black checklist comprises 27 questions that assess the quality of randomized and nonrandomized studies of health care interventions.37 Modified versions of the checklist are commonly used to establish the quality of observational studies in sports science and medicine.38 We retained the 13 questions that we deemed appropriate (items 1, 2, 3, 5, 6, 7, 9–12, 16, 18, and 25). Discrepancies in quality scoring were discussed among 3 authors (C.D., C.D., and C.B.), and a consensus was reached. Full quality scoring is presented in Supplemental Table 2.
The 21 studies in the review involved 6 sports and 500 participants (soccer = 181, American football = 159, Australian football = 87, rugby sevens = 48, field hockey = 12, volleyball = 13). Assessment durations varied from 8 days to 36 weeks and included undefined season-long monitoring (Table 3). Most researchers described the self-report measure as evaluating wellness; thus, we used this term to present the results collectively. Associations between wellness measures and training load measures have been summarized in the text and are presented in Supplemental Table 3.
The study quality assessment is presented in Supplemental Table 2. In 12* of the 21 studies, the athletes' sex was not stated. Where sex was known (n = 176), female athletes were underrepresented in the sample (n = 29, 16%). Principal confounders of the research in this review were factors such as player position, fitness, ambient temperature, and training type. Six investigations16,20,26,32,35,36 partially accounted for the distributions of these factors, whereas 7 studies16,17,21,26–28,35 accounted for confounding factors in their data analysis. None of the authors reported the required measure of random variability (interquartile range) for the ordinal data of their self-report measures; however, 1 study25 was awarded a score for this question because the self-report measures were deemed not to be a main outcome. The statistical tests used were determined to be appropriate in all cases. A total of 15 groups16,20–25,27,30–36 described participants lost to follow-up; this score was not awarded when the reason for missing data was not reported.
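As a minimal sketch of the variability reporting that this checklist item calls for, the median and interquartile range of ordinal ratings could be computed as shown here. The ratings are invented for illustration, and the quartile convention (the default "exclusive" method of Python's `statistics.quantiles`) is an assumption; other conventions can give slightly different quartiles.

```python
from statistics import median, quantiles

def median_iqr(ratings):
    """Return (median, Q1, Q3) for a list of ordinal ratings."""
    # quantiles() sorts the data internally; n=4 yields the three quartile cut points.
    q1, _, q3 = quantiles(ratings, n=4)
    return median(ratings), q1, q3

# Hypothetical 1-7 fatigue ratings from one squad on one morning.
fatigue = [3, 4, 4, 5, 5, 5, 6, 6, 7]
med, q1, q3 = median_iqr(fatigue)
print(f"median = {med}, IQR = {q1} to {q3}")
```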
The number of questions ranged from 3 to 8, and 5-, 7-, or 10-point Likert scales were used. All single-item variables were related to physical and psychological health except in 1 investigation23 in which the desire to train was a variable; individual results for this variable were not provided. The most commonly used measures were muscle soreness (n = 20), fatigue (n = 20), sleep quality (n = 19), stress (n = 14), and mood (n = 6). A total of 14 studies provided or used a composite measure, 10 of which16–20,24,27,28,30,31 were summed wellness scores and 4 of which21–23,32 were averaged scores. Six other variables were assessed (motivation to train, sleep quantity, time of sleep, mental fatigue, energy, and perceived recovery). When the direction of the scale was evident (n = 18), a higher score reflected better wellness in 11 studies,16–18,23,24,28,32–36 and a lower score represented better wellness in 7 studies.19,20,25,27,29–31 The authors of 12 investigations† cited a 1995 review by Hooper and Mackinnon6 in describing the design of their self-report measure: 4 sets of authors17,18,20,31 labeled the summed wellness score the Hooper Index, and 1 set of authors30 labeled it the Hooper Score (Table 4).
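The difference between summed and averaged composite scores, and the effect of scale direction, can be illustrated with a short sketch. The item names, the 1 to 7 scale, and the ratings are hypothetical and are not taken from any included study.

```python
def reverse_score(value, scale_min=1, scale_max=7):
    """Flip a rating so that the direction of the scale is reversed."""
    return scale_max + scale_min - value

def composite(ratings, method="sum"):
    """Combine single-item ratings into a summed or averaged wellness score."""
    total = sum(ratings.values())
    if method == "sum":
        return total
    if method == "mean":
        return total / len(ratings)
    raise ValueError("method must be 'sum' or 'mean'")

# Hypothetical response on a 1-7 scale where a LOWER score means better wellness.
response = {"muscle_soreness": 2, "fatigue": 3, "sleep_quality": 1, "stress": 2}

summed = composite(response, "sum")      # 8
averaged = composite(response, "mean")   # 2.0

# The same response re-expressed so that a HIGHER score means better wellness.
flipped = {item: reverse_score(v) for item, v in response.items()}
summed_flipped = composite(flipped, "sum")  # 24
```

The same athlete state yields 8 on one scale direction and 24 on the other, which is why the direction of the scale needs to be known before a positive or negative association with training load can be interpreted.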
Thirteen studies explored (1) the relationship between pretraining wellness and subsequent load output (ie, same day)16,18,21,24,26–28 and (2) the relationship between training load and subsequent wellness (ie, next day)18,22,32–36; whereas in 8 studies,17,19,20,23,25,29–31 the combination of data used for the correlation analysis (ie, whether they used same-day versus next-day data) was not clear. Researchers used a variety of derived variables, true values, changes in values and z scores, means and group means, normalization, and fixed and random effects in their analyses. Results were presented as a mix of significance testing, effect sizes and correlation values, and interpretations of magnitude and subsequent inferences that were mostly based on the work of Hopkins et al,39,40 Batterham and Hopkins,41 and Hopkins42 (see Supplemental Table 3). Figure 2 illustrates the magnitude of the correlations as interpreted in the original studies (Pearson and Spearman analyses only) between single-item and composite wellness scores and training load on the basis of the number of observations used in the original studies. If the number of observations was not stated in the original study, the maximum number of potential observations was estimated by multiplying the sample size by either the number of sessions or duration of the study. Investigations are grouped by analysis type for the results summaries, whereas Supplemental Table 3 provides further details.
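The estimate of the maximum number of observations described above is a simple product, and the qualitative magnitude labels can be expressed as threshold bands. In the sketch below, the study figures are hypothetical, and the cut-offs (0.1, 0.3, 0.5, 0.7, and 0.9) are an assumption based on the scale commonly attributed to Hopkins; individual studies may have applied different thresholds.

```python
def max_observations(sample_size, sessions_or_days):
    """Upper bound on observations when a study did not report the number."""
    return sample_size * sessions_or_days

def magnitude(r):
    """Label the size of a correlation coefficient, ignoring its sign."""
    size = abs(r)
    # Assumed upper bounds for each band (not confirmed for every included study).
    bands = [
        (0.1, "trivial"),
        (0.3, "small"),
        (0.5, "moderate"),
        (0.7, "large"),
        (0.9, "very large"),
    ]
    for upper, label in bands:
        if size < upper:
            return label
    return "nearly perfect"

# Hypothetical study: 20 athletes monitored across 30 sessions.
print(max_observations(20, 30))   # 600
print(magnitude(-0.25))           # small
print(magnitude(0.72))            # very large
```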
Composite Wellness Scores: Summed and Averaged
Associations between summed wellness and training-load measures were reported in 8 studies (see Supplemental Table 3). Pearson and Spearman correlations revealed small to moderate,17,18,30 large,19,20 and very large27 positive and negative associations between summed wellness scores and measures of training load. Two groups used a summed wellness score analyzed with linear mixed models. Govus et al24 found that a 1-unit increase in wellness z score was associated with a trivial increase in player load, and Malone et al28 found trivial to very likely negative effects of the wellness z score on training output measures.
An average wellness score was computed in 4 studies. Gallo et al21,22 calculated linear mixed models and demonstrated no effect of days postmatch or match load on the weekly wellness profile22 and predominantly trivial effects of the wellness z score on training output variables.21 Using the Spearman correlation, Sampson et al32 identified a trivial negative association between the wellness z score and the acute:chronic workload ratio on the previous day. Using the Pearson correlation, Gathercole et al23 reported a small negative correlation between wellness and training impulse for a 7-day average.
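Several of the studies above expressed wellness as a z score. A minimal sketch of one way to do this follows, assuming that each score is standardized against the same athlete's own mean and SD; the included studies may have used a different reference period or a group-level reference, and the values shown are invented.

```python
from statistics import mean, stdev

def z_scores(values):
    """Standardize one athlete's scores against that athlete's own mean and SD."""
    m, s = mean(values), stdev(values)  # stdev() is the sample SD (n - 1 denominator)
    if s == 0:
        raise ValueError("z scores are undefined when all values are identical")
    return [(v - m) / s for v in values]

# Hypothetical summed wellness scores for one athlete across 5 mornings.
history = [18, 18, 20, 22, 22]
z = z_scores(history)
print(z)  # [-1.0, -1.0, 0.0, 1.0, 1.0]
```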
Individual Wellness Variables
Associations between muscle soreness and training load measures were reported in 14 investigations. Pearson and Spearman correlations varied from trivial17,32 to small,18,31 moderate,16,29,30 and large20 negative and positive associations between measures of muscle soreness and measures of training load.
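Because Likert ratings are ordinal and often tied, the Spearman coefficient is computed on ranks, with tied values sharing their average rank. The following standard-library sketch illustrates the calculation; the soreness ratings and load values are hypothetical.

```python
from statistics import mean

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical next-day soreness ratings and the preceding session loads.
soreness = [2, 3, 3, 5, 6]
load = [300, 450, 400, 600, 700]
rho = spearman(soreness, load)
print(round(rho, 3))
```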
Applying linear mixed models, Thorpe et al33,34 found no correlation between training load and muscle soreness. Wellman et al35,36 used categorical scores to determine that players who rated muscle soreness higher had greater training loads than those who scored lower. Govus et al24 reported that muscle soreness was not related to player load or session rating of perceived exertion (sRPE); when the variables were modelled individually, a 1-unit increase in muscle soreness corresponded with a trivial decrease in sRPE. Conversely, Henderson et al26 described increased muscle soreness as associated with a trivial increase in the physical performance factor value.
Associations between measures of fatigue and training load were reported in 13 studies. Pearson and Spearman correlations yielded results that varied from no correlation29 to small,16–19 moderate,25,30 large,20 and very large31 negative and positive associations between measures of fatigue and measures of training load.
Using linear mixed models, researchers found a large negative33 association between total high-intensity running and fatigue and small to moderate associations between fatigue and cumulative days of high-speed running distance.34 Based on categorical values, Wellman et al35,36 determined that players who rated fatigue higher had greater training loads than those who scored lower, whereas fatigue was not included in the best-fit model of Henderson et al.26
The association between sleep quality and training load was provided by 13 groups. Pearson and Spearman correlations indicated no associations25 to trivial,17 small,16,18,30 large,20 and very large31 negative and positive associations between measures of sleep quality and measures of training load.
In 3 studies using linear mixed models, researchers found no associations24,33,34 between measures of training load and sleep quality. Wellman et al35 demonstrated no differences in training load related to sleep quality, but Wellman et al36 noted differences between maximal-intensity deceleration distance and sprint distance for players who rated sleep quality as different on specific days of the week. Sleep quality was not included in the best-fit model of Henderson et al.26
The association between stress and training load was reported in 9 studies. Pearson and Spearman correlations displayed results that ranged from no correlation18 to small,16–18 moderate,30 and large20 positive and negative associations between measures of stress and training load.
Henderson et al,26 using linear mixed models, reported that increased stress was associated with a trivial increase in the physical performance factor value, whereas Wellman et al35,36 used categorical values to show that those who perceived less stress had lower training loads than those who perceived more stress.
Three sets of investigators depicted the association between mood and training load. Buchheit et al16 used the Pearson correlation and observed a small positive association. Wellman et al35 used linear mixed models and found that those with more favorable mood scores had lower training loads than those with less favorable responses. However, Wellman et al36 noted no differences in movement variables for measures of mood.
Associations between training load and other individual variables were reported in 5 studies. Using Spearman correlations, researchers detected no association between training load and sleep quantity across the season29 and a trivial negative association between energy and the acute:chronic workload ratio.32 In addition, no correlation was present between training data and the measures of mental fatigue or sleep duration.25 Govus et al24 used linear mixed models and reported that pretraining energy was trivially positively related to player load and not related to sRPE. Henderson et al26 indicated that a higher level of perceived recovery was associated with a trivial decrease in the physical performance factor value.
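For readers unfamiliar with the load metrics named in these results, the sketch below shows how a session RPE load and an acute:chronic workload ratio are commonly derived. The 7-day acute and 28-day chronic rolling-average windows are an assumption; the included studies may have used other windows or weighting schemes, and all values are hypothetical.

```python
def session_load(rpe, duration_min):
    """Session RPE load: perceived exertion rating multiplied by duration in minutes."""
    return rpe * duration_min

def acwr(daily_loads, acute_days=7, chronic_days=28):
    """Acute:chronic workload ratio from rolling averages of daily load."""
    if len(daily_loads) < chronic_days:
        raise ValueError("not enough days of data for the chronic window")
    acute = sum(daily_loads[-acute_days:]) / acute_days
    chronic = sum(daily_loads[-chronic_days:]) / chronic_days
    if chronic == 0:
        raise ValueError("ratio is undefined when the chronic load is zero")
    return acute / chronic

print(session_load(6, 75))  # 450

# Hypothetical athlete: 3 weeks at 400 units per day, then 1 week at 600.
loads = [400] * 21 + [600] * 7
ratio = acwr(loads)  # 600 / 450
print(round(ratio, 2))  # 1.33
```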
Our aims in this review were to identify the applications and designs of single-item self-report measures used to monitor team-sport athletes and present their relationship with measures of training load. Predominant findings were a paucity of evidence-based practice in the adoption and design of single-item self-report measures beyond general recommendations, a variety of data-collection and -analysis techniques, and a spectrum of no correlation to very large correlations with measures of training load. We provided an in-depth critical analysis of the use of single-item self-report measures in team sports and highlighted the lack of quality and rigor in how these measures were approached and used.
Readers should be vigilant in interpreting the results regarding the direction of the scale (ie, higher or lower scores reflecting better wellness) with respect to positive or negative relationships with workload, the direction of the proposed effect, the type of load measure (ie, internal sRPE or external load measures), and the analytic approaches used. Heterogeneity of the approaches made the results challenging to synthesize with clarity (Figure 3).
Self-Report Measure Overview
A concern regarding the use of single-item self-report variables was the lenient interpretation and citation of previous recommendations. This was frequently related to the work of Hooper and Mackinnon6 and Hooper et al.43 Whereas the authors of 12 studies‡ referenced a 1995 review by Hooper and Mackinnon6 for their chosen self-report measures, most of these researchers used a design closer to the questionnaire used by Hooper et al43 in an original paper from the same year rather than the recommendations in the cited review (ie, the original study43 involved a 7-point scale of 4 items, whereas in the review, the authors6 recommended a 5-point scale of 7 items). This included, in some cases, the construction of a Hooper Index or Hooper Score (also referred to as the Hooper Scale in other studies); however, in the referenced studies, the authors suggested neither the terms nor the summing method. This “naming” alludes to the idea of a validated, formative construct and was subsequently cited by others.
Relationship Between Measures of Wellness and Measures of Training Load
The existence, size, and direction of a relationship between single-item self-report measures and measures of training load varied for both composite and individual measures, creating a challenge in drawing inferences or conclusions from the results. The existence of a relationship appeared more prominent in studies that used correlation measures than in those that used linear mixed models, potentially because the mixed models accounted for the correlation within repeated measures for each athlete. This was most notable in Ihsan et al,27 who normalized measures to time and RPE, but this investigation had one of the smallest numbers of observations (maximum of 72) of the included works.
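One reason that pooled correlations may differ from analyses that account for repeated measures can be shown with invented data. In this sketch, 2 hypothetical athletes have different habitual loads and wellness baselines; centering each athlete's values on their own mean is used as a simple stand-in for athlete-level clustering and is not a linear mixed model.

```python
from statistics import mean

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def centered(values):
    """Subtract an athlete's own mean from each of their values."""
    m = mean(values)
    return [v - m for v in values]

# Hypothetical data: (training loads, next-day wellness) for each athlete.
athletes = {
    "A": ([200, 300, 400], [12, 11, 10]),
    "B": ([600, 700, 800], [20, 19, 18]),
}

# Pooled analysis: every observation is treated as independent.
all_load = [v for load, _ in athletes.values() for v in load]
all_well = [v for _, well in athletes.values() for v in well]
pooled_r = pearson(all_load, all_well)

# Within-athlete analysis: baseline differences between athletes are removed first.
c_load = [v for load, _ in athletes.values() for v in centered(load)]
c_well = [v for _, well in athletes.values() for v in centered(well)]
within_r = pearson(c_load, c_well)

print(round(pooled_r, 2), round(within_r, 2))  # 0.83 -1.0
```

In this constructed case, wellness falls as load rises within each athlete, yet the pooled coefficient is positive because the athlete with the higher loads also reports higher wellness overall.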
Whereas Saw et al3 found the ASRM to be responsive to acute and chronic training loads, they used a bespoke method of measuring sensitivity and consistency that was not replicable. In addition, the self-report measures used in the review of Saw et al3 were typically retrospective; they often referred to a time period in the previous week (eg, profile of mood states) or the previous 3 days (eg, recovery stress questionnaire for athletes), in contrast to the predominantly daily measured single-items evaluated here. They identified the vigor/motivation, physical symptoms/injury, nontraining stress, fatigue, physical recovery, general health/wellbeing, and “being in shape” subscales as useful in their review; only 1 subscale (ie, fatigue) directly overlaps with the measures in our review.
One of the most challenging aspects of researching the use of self-report measures of athlete wellbeing and, by extension, conducting a narrative synthesis of this literature, was the tendency for observations to be made in an uncontrolled “real-world” environment. Authors of 4 studies17,18,28,30 explicitly stated that the measures used were part of the normal team monitoring routine, whereas authors of 17 studies16,19–27,29,31–36 did not specify whether the methods existed previously or had been implemented for research purposes. Heterogeneity among the included studies was further demonstrated by the variety of inclusion and exclusion strategies, including the requirement for, or occurrence of, full participation (n = 6),16,20,30,31,33,34 minimal training or competition participation requirements (n = 5),17,18,22,25,36 the exclusion of goalkeepers (n = 2),20,29 and the requirement for ordinal self-report data to be normally distributed (n = 2).21,28 Investigators in a further 7 studies19,23,24,26,27,32,35 did not reference any requirements for the inclusion or exclusion of participants or data. In only 1 study did the authors25 comment that interim results from data collection were not provided to the coaching staff. It was unclear in all other cases whether the self-report data were or may have been used by practitioners to manipulate the training design or participation throughout the study, thereby creating interdependency between consecutive observations. Caution must be applied in making recommendations on the basis of such works.44 In light of these factors, we echo the recommendation of Fullagar et al,45 who suggested that studies such as these that lack experimental control should be presented as case reports and that authors should resist making inferences beyond general observations.
Whereas research requires controlled experiments to establish confidence in a result, one can also argue that controlled, laboratory-based tests do not transfer to the real-world environment. If the isolated relationship between workload and ASRM data is clinically relevant, then we require controlled data collection to identify whether a relationship exists and, if so, its underlying nature. However, in a real-world scenario, these measures are unlikely to be influenced by training load in isolation, given that they will also be determined by factors such as recovery and psychological and social influences. This indicates that these measures may be more reflective of complex “readiness” than a linear training “response.” Yet defining an optimal state of readiness continues to be elusive; authors of the studies included in this review who explicitly evaluated the effect of pretraining wellness on subsequent training output found predominantly trivial results.
Considerations for Practice
The existence and nature of any relationship between training load and wellbeing create an interesting debate for practitioners seeking to use these outcomes to inform their clinical decision making. For example, the clinical importance of a specific summed wellness z score of −1 corresponding to a difference of 4-m sprint distance in training28 is unclear. Whether these small correlations and magnitude-based inferences of heterogeneous self-report data can adequately inform the design of future measures and whether they justify the appropriateness of a single-item ASRM in its currently recommended form must be considered. Readers should also be aware of the variety of scoring, summing, and averaging techniques used to analyze the self-report measures in this review when they consider the adoption or are planning the design of single-item measures in practice. Saw et al9 outlined recommendations for developing custom ASRMs to ensure acceptable psychometric properties; however, readers may also take into account design requirements relative to their needs. For instance, if the primary role of the ASRM is facilitating communication and prompting information disclosure, a simple approach may suffice.10,14
With regard to current practice, investigators have suggested that data from self-report measures are indeed used predominantly as status indicators and facilitators of communication rather than as decision-making tools14 and that their value is predicated on athletes' honesty and practitioners' interpretations.10 Although many practitioners seek clarity and simplicity in athlete monitoring, the complexity of the relationship between wellness and training load and the difficulties associated with their measurement necessitate substantial investment of human resources to enable the best use of an ASRM. Indeed, Hooper and Mackinnon6 proposed that the use of these self-analysis tools to identify trends toward overtraining syndrome depended largely on interpretation by a coach. Furthermore, many guidelines that inform the design and implementation of single-item ASRMs have been based on studies that used these measures for monitoring overtraining and, as such, narrow Likert scales may not be sensitive to smaller day-to-day variations in the athlete's state. Further research is required to apply clinically meaningful and individual “traffic-light” approaches with ordinal scales, such as those used in ASRMs.12 These factors are also important to recognize when considering the potential value of introducing a self-report system in a sports program, given the challenges associated with implementation, adherence, dishonesty, resource investment, and system factors.10,46
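To make the idea of an individual "traffic-light" approach concrete, the sketch below flags a daily score relative to the athlete's own baseline. The z-score cut-offs of −1 and −2 are arbitrary placeholders rather than validated thresholds, the scale is assumed to run so that a higher score reflects better wellness, and the baseline values are invented.

```python
from statistics import mean, stdev

def traffic_light(today, baseline, amber_z=-1.0, red_z=-2.0):
    """Flag today's score against the athlete's own baseline scores.

    Assumes higher scores mean better wellness; the cut-offs are placeholders.
    """
    m, s = mean(baseline), stdev(baseline)
    if s == 0:
        raise ValueError("baseline has no variability; z score is undefined")
    z = (today - m) / s
    if z <= red_z:
        return "red"
    if z <= amber_z:
        return "amber"
    return "green"

# Hypothetical baseline of summed wellness scores (mean 20, SD 2).
baseline = [18, 18, 20, 22, 22]
print(traffic_light(19, baseline))  # green (z = -0.5)
print(traffic_light(17, baseline))  # amber (z = -1.5)
print(traffic_light(15, baseline))  # red   (z = -2.5)
```

Treating ordinal ratings as interval data in this way is itself a simplification, which is part of why clinically meaningful thresholds remain to be established.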
Review Limitations and Future Work
Whereas our review provides insight into this complex area of sports medicine research, a number of limitations should be acknowledged. First, the possibility that studies identifying correlations between self-report measures and workload were more likely to be published than those not identifying such correlations was high but also difficult to detect. A funnel plot and its associated statistics are typically used in intervention research to elucidate such bias; nonetheless, this was not possible due to the heterogeneity of the studies (and their associated primary and secondary outcomes). Publication bias, if it exists, may result in overestimation of the strength of the relationship between the measures evaluated. Next, study heterogeneity, including the variations in the self-report and training load measures used, the different frequencies and durations of administration, and the alternative statistical analysis approaches, made synthesizing this literature particularly challenging and drawing definitive conclusions impossible. These limitations highlight the need for further research and more critical consideration in the design, use, and analysis of self-report measures in research and practice beyond interpreting general recommendations for athlete self-appraisal.
In addition, we were unable to identify long-term relationships between single-item ASRMs and training load measures because the included studies focused primarily on daily analyses. In future evaluations, investigators may use single-item self-report data to represent different stages of the season or binary measures, such as changes in the days posttraining or competition. Researchers may also investigate the ability of single-item self-report measures to reflect athlete readiness or recovery, although defining such a state of readiness is a subject for debate.
Team-sport programs are interpreting general recommendations in their use of single-item self-report measures, which predominantly feature muscle soreness, fatigue, sleep quality, stress, mood, and often a composite score. Studies with the largest numbers of observations showed predominantly trivial to moderate associations between single-item self-report measures and measures of workload. Where associations were found, the directions of the relationship were predominantly negative (ie, when training load increased, wellness decreased). Although the nature of the relationship is still unclear, such an association may not be expected or useful. Self-report measures have established value for users in sport programs through certain aspects, such as communication,10,14 yet further assessment is required to establish their clinical utility beyond the role of a complementary tool. Future authors should consider the measurement properties of single-item self-report measures and establish their relationship with clinically meaningful outcomes. The potential may exist for such measures to be incorporated in predictive risk tools in conjunction with established risk factors; however, further research is required to inform practitioners on the appropriate objective application of data from single-item self-report measures in team sport.
This work was supported by funding from Science Foundation Ireland (Drs Duignan and Caulfield) under the grant for the Insight Centre for Data Analytics (SFI/12/RC/2289_P2) and the Health Research Board (Dr Doherty) under grant No. ARPP-A-2018-002.