The King-Devick (KD) test is a screening tool designed to assess cognitive visual impairments, namely saccadic rhythm, postconcussion. Test-retest reliability of the KD in a healthy adolescent population has not yet been established.
To investigate the overall test-retest reliability of the KD among a sample of healthy adolescents. Additionally, we sought to determine if sex and age influenced reliability.
Sixty-eight healthy adolescents, 41 boys (age = 15.4 ± 1.9 years) and 27 girls (age = 15.4 ± 1.9 years).
Participants completed the KD (version 1) at 3 testing sessions (days 1, 30, and 45) following standard instructions. We recorded total time to complete the reading of 3 cards for each participant during each testing session. Two-way random-effects intraclass correlation coefficients (ICCs) using single measurements repeated over time and repeatability coefficients were calculated. Linear mixed models were used to determine whether differences existed at each testing time and to examine whether changes that took place among visits were different by sex or age.
Adolescents who completed the KD demonstrated acceptable reliability (ICC = 0.81; 95% confidence interval = 0.73, 0.87); however, the repeatability coefficient was large (±8.76 seconds). The sample demonstrated improvements between visits 1 and 2 (mean ± standard error = 4.3 ± 0.5 seconds, P < .001) and between visits 2 and 3 (2.4 ± 0.5 seconds, P < .001) for a total improvement of 6.7 seconds over 3 tests. No significant visit-by-sex or visit-by-age interactions were observed.
Despite the ICC being clinically acceptable, providers using the KD test for serial assessment of concussion in adolescents should be cautious in interpreting the results due to a large learning effect. Incorporating multiple measures can ensure accurate detection of sport concussion.
The King-Devick (KD) test demonstrated good reliability across 3 testing sessions in a healthy adolescent population.
However, clinical interpretation of the KD test should take into consideration that a large learning effect occurred in a healthy population across 3 test sessions.
Clinicians may include the KD test in a postconcussion evaluation, but multiple measures should be administered to ensure accurate detection of sport concussion.
The scientific community and society are being called on to think differently in regard to sport concussion (SC) in youth.1 Sport concussions have become a major topic in the medical community and media over the past 20 years, with estimates of 3.8 million concussions occurring each year in the United States.2 Over time, epidemiologists have reported SC frequencies of as low as 3% to 5% of athletic injuries and as high as 18% to 26% depending on the sex, sport, and whether the child was participating in practice or competition.3−8 Unfortunately, this estimate does not take into account the 50% to 56% of concussions that go unreported, particularly in collision sports such as football and hockey.9−12
A concussion represents a traumatic, acute event that can create long-term consequences for the adolescent and family, particularly if the condition is unrecognized, misdiagnosed, or mismanaged.13−16 Therefore, it is critical for clinicians to respect the array and variation of both the symptoms and the physical presentation of the neurometabolic cascade after a concussive event.17,18 The physical presentation almost always includes 1 symptom or a combination or clustering of symptoms. Common practice is for clinicians to question injured adolescents in regard to their symptoms to identify key indicators of a possible concussion.18−22 Symptom inventories have become a standard in helping clinicians determine not only which symptoms an adolescent may be experiencing but also the severity of those symptoms. However, although symptom inventories are helpful in a concussion evaluation, they have limitations. First, symptoms do not always appear immediately, and they may not fully account for all the physical impairments that exist after an injury.20 For example, injured athletes have demonstrated balance deficits in cases even when they have not described impaired balance as a symptom or they reported being symptom free; thus, balance testing should be a part of postconcussion testing.23−25 Also, athletes are not always truthful in reporting symptoms for fear they will not be allowed to continue to participate or return to participation.9,11,12 Therefore, researchers and clinicians have brought to light the need for multiple objective assessment measures beyond self-report symptoms to describe the full sequelae of functional impairments after a concussion. Neurocognition, vestibulomotor, and oculomotor changes may be present after traumatic brain injury and should be addressed in the evaluation process.26−28
To accurately identify the deficits an adolescent experiences after a concussion, the evaluation and diagnostic process should be both comprehensive and objective from the sideline assessment through the final evaluation for return to play. A criterion standard has yet to be developed for diagnosis or treatment; however, experts support the use of multiple objective tools as the most beneficial approach to measure physical and functional deficits in symptoms, neurocognition, and balance.29 In addition to being accurate, tools need to be accessible, easy to deliver, and administered by the most qualified professional available. Some commonly available instruments within a testing battery might include self-report symptom scales, the Sport Concussion Assessment Tool 3 (SCAT3) or Child SCAT3, computerized neurocognitive assessment tools such as Immediate Post-concussion Assessment and Cognitive Testing (ImPACT), and the Balance Error Scoring System.21,22,30−33 Another recognized domain for assessment is oculomotor function, as it is often compromised after a concussion.27,34,35
Oculomotor function is a complex process controlled by important sensory systems. Deficits related to oculomotor dysfunction posttraumatic brain injury include saccades, antisaccades, smooth pursuit, vergence, accommodation, vestibulo-ocular reflex, and nystagmus.36 Saccadic rhythm has been the focus of studies examining additional evaluation strategies postconcussion.31,32,37,38 Ocular saccades are regulated by the dorsolateral prefrontal cortex, an area of the brain that is susceptible to disruption as a result of traumatic head injury.36 Saccades are rapid eye movements required to focus on objects.39 After a concussion, directional errors, poor spatial accuracy, and prolonged latencies can affect saccades; these changes are related to deficits in executive function, attention, and memory.27,34,36 These deficits can create functional and lifestyle impairments in patients. Heitger et al34 compared postconcussion syndrome (PCS) patients with a healthy cohort and found several oculomotor trends: the PCS patients produced more final eye-position errors, slower peak velocities, and larger final amplitude errors in memory-guided sequences. Smooth pursuit, which is controlled by the cerebellum and prefrontal cortex, can require saccadic movements.36 Because of these changes in physiological function and saccades, saccadic rhythm can be assessed to potentially detect a concussion. 
A clinical tool that was introduced to evaluate saccadic dysfunction, particularly in a sideline assessment, is the King-Devick (KD) test.40 The KD test was originally designed in the 1970s as a screening instrument to identify adolescents with learning disabilities and reading fluency concerns caused by changes in saccadic rhythm (ie, dyslexia).41−45 Later, results on the KD test were used as an outcome measure for sleep-deprivation studies as well as a potential screen of reading ability in kindergarten and first-grade students.44,46 Because saccadic rhythm may be disrupted after traumatic brain injury, the KD test can add valuable physical data to the concussion-assessment protocol.35,47,48 Slower times or more errors on the test (or both) after a traumatic force to the head and brain may support the diagnosis of a concussion.
Researchers31,32,37,49,50 have examined the efficacy of the KD test as a sideline-assessment tool, particularly in collegiate and professional athletes. However, investigations of the KD test for concussion assessment in an adolescent population are limited.38 Early authors38,42,44,51 examined the test-retest reliability of the KD test in a pediatric population to determine its efficacy for detecting reading disability and measuring reading improvement over time; however, test-retest reliability data were inconclusive, as a pronounced learning effect was described. Currently, we are unaware of any studies specifically examining the test-retest reliability of the KD test in a healthy adolescent population. Because the KD test has been suggested for use in a preinjury-postinjury assessment model, it is essential to establish test-retest reliability in a healthy population over time to determine clinical utility. Strong test-retest reliability of the KD in a healthy population would ensure that findings in an injured population are related to the condition rather than to systematic or random error.
The purpose of our study was to describe the test-retest reliability of the KD test over a clinically relevant timeframe in a healthy adolescent (12−18 years old) sample. Satisfactory test-retest reliability over clinical timeframes will assure clinicians who are using the KD to assess impairments after concussions that identified impairments are due to the condition and not to poor reliability of the test itself. We hypothesized that the KD test would have clinically acceptable or good test-retest reliability (reliability coefficient > 0.75) and would therefore be an effective and practical sideline and return-to-play concussion-assessment tool in adolescents.21,52 We also sought to examine the effects of sex and age on the test-retest reliability of the KD. If the KD demonstrated strong test-retest reliability, it could supplement the battery of tests used after concussions in adolescents to objectively measure oculomotor dysfunction. Findings from this study could also add to the literature on evidence-based SC assessment and management strategies in an adolescent population.
The study received approval from the human subjects review board at South Dakota State University. Written parent or guardian permission and student assent were obtained before data collection. Test-retest dates were determined in meetings between the principal investigator (T.J.O.) and athletic director of each participating high school.
The principal investigator met with school administrators from the 3 school districts to request permission to recruit their student-athletes. Once approval to recruit students from the school district was granted, the principal investigator and administrators established a process to inform students and parents of the study, invite them to participate, and schedule potential dates and times for testing.
Male and female student-athletes who had no history of learning disabilities and were 12 to 18 years old were recruited for participation. Student-athletes were the most relevant population because they were at risk of sustaining concussions from participation in sport and therefore would receive follow-up testing with the KD test. To avoid bias in the selection process, we invited all student-athletes in each of the 3 school districts to participate. All adolescents who provided completed permission and assent forms and met the inclusion criteria were eligible for participation.
Before the study, we recruited and trained 5 testers according to the directions for the KD test. The KD test was delivered via paper cards and hand timers rather than the electronic or tablet format; therefore, testers were trained in the former method. Testers watched a KD training video and were given the opportunity to perform the test on themselves. During actual testing, the examiners read only the language of a prepared statement.
Each participant completed 3 testing sessions: test 1 (baseline), test 2 (30 days postbaseline), and test 3 (45 days postbaseline). The timeframe of these sessions represented how the KD test may be delivered to collect data important to clinical decision making, such as confirming a concussion diagnosis or when to make changes in managing a concussion. At each testing session, version 1 of the KD test was administered following the standard directions for baseline and postconcussion testing.53 The initial day of testing represented a baseline test, so the KD was delivered twice using the scoring instructions, and the fastest total time without errors of the 2 attempts was recorded as the baseline score. On days 30 and 45, students were allowed only 1 attempt; these 2 testing dates represented the use of the KD during postconcussion testing sessions. Their total time to complete the 3 cards as well as any errors were recorded.
Collected data included the sex, age, recorded time, and error count for each participant. The KD test final score is based on elapsed time to take the test and number of errors.
The data were analyzed using STATA (version 12; StataCorp, LLC, College Station, TX). A 2-way random-effects intraclass correlation coefficient (ICC) using single measurements repeated over time was calculated to determine the reliability between testing sessions for all adolescents. We interpreted test-retest reliability for clinical use as poor (ICC reliability coefficient < 0.50), moderate (ICC reliability coefficients from 0.50−0.75), or good (ICC reliability coefficients > 0.75).52 Repeatability coefficients were calculated using methods described by Bland and Altman.54 The within-participant standard deviation (SDw) was calculated as the square root of the residual mean square obtained using 1-way analysis of variance with participant as a factor variable. Using the SDw, repeatability coefficients (CRs) were calculated as $\mathrm{CR} = 1.96 \times \sqrt{2} \times \mathrm{SD}_w \approx 2.77 \times \mathrm{SD}_w$. The CR indicates the value within which the absolute difference between 2 tests using the same method is expected to fall for 95% of participants.55 Finally, linear mixed-effects models were used to determine if time to complete the instrument was different among visits. Within the linear mixed-effects models, visit-by-sex and visit-by-age interactions were tested to determine if changes in time to complete the instrument differed by sex or age. Statistical significance was set at α = .05.
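The reliability statistics described above can be illustrated with a minimal Python sketch. It assumes a complete participants-by-visits table of completion times, the absolute-agreement form of the 2-way random-effects, single-measurement ICC (often written ICC[2,1]; the agreement definition is an assumption, as the text does not specify it), and the Bland and Altman repeatability coefficient computed as 1.96 × √2 × SDw. The function names and the sample times are hypothetical; they are not study data and do not reproduce the STATA analysis.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def anova_mean_squares(scores):
    """Return (MS_participants, MS_visits, MS_error) for a complete n x k table."""
    n, k = len(scores), len(scores[0])
    grand = _mean([v for row in scores for v in row])
    row_means = [_mean(row) for row in scores]
    col_means = [_mean([row[j] for row in scores]) for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in scores for v in row)
    ss_error = ss_total - ss_rows - ss_cols
    return ss_rows / (n - 1), ss_cols / (k - 1), ss_error / ((n - 1) * (k - 1))


def icc_2_1(scores):
    """2-way random-effects, absolute-agreement, single-measurement ICC (assumed form)."""
    n, k = len(scores), len(scores[0])
    msr, msc, mse = anova_mean_squares(scores)
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def repeatability_coefficient(scores):
    """Bland-Altman CR = 1.96 * sqrt(2) * SDw, with SDw from the 1-way
    (participant as factor) residual mean square."""
    n, k = len(scores), len(scores[0])
    within_ss = sum((v - _mean(row)) ** 2 for row in scores for v in row)
    sd_w = math.sqrt(within_ss / (n * (k - 1)))
    return 1.96 * math.sqrt(2) * sd_w


# Hypothetical completion times in seconds (rows = participants, columns = visits).
example_times = [
    [52.0, 48.5, 46.0],
    [60.2, 55.0, 54.1],
    [45.3, 42.0, 40.5],
    [50.0, 47.2, 44.8],
]
print("ICC(2,1):", round(icc_2_1(example_times), 2))
print("CR (s):", round(repeatability_coefficient(example_times), 2))
```

Under this definition, a CR of 8.76 seconds would correspond to an SDw of roughly 3.16 seconds (8.76 / 2.77). Because the 1-way residual mean square includes any systematic shift between visits, a learning effect inflates SDw and therefore the CR.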
Seventy-four adolescents met the inclusion criteria and 68 (41 boys, 27 girls) completed all visits, for a 92% retention rate. Dropouts were due to adolescents not attending school on a follow-up testing date because of illness (2 students), missing the test due to previous obligations of college tours (2 students), or voluntarily discontinuing participation (2 students). Participant characteristics are provided in the Table.
In our entire sample, the ICC revealed good test-retest reliability from day 1 to day 45 (ICC = 0.81 [95% confidence interval = 0.73, 0.87]); however, time to complete the KD test decreased at each visit (mean ± SD = 51.8 ± 9.9 seconds on day 1, 47.5 ± 9.8 seconds on day 30, and 45.1 ± 9.1 seconds on day 45; P < .001). The number of errors at baseline was negligible and did not vary by visit. The CR for our data was 8.76 seconds, indicating that, if a repeat test was administered, we would expect the result to fall within ±8.76 seconds for 95% of the population.
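As a quick arithmetic check on these results, the short Python sketch below recomputes the visit-to-visit changes from the reported mean times and compares a test-retest difference with the reported CR. The helper names are hypothetical, and the comparison is illustrative only: the CR describes differences within an individual, so applying it to group means is a simplification, and the check is not proposed as a clinical decision rule.

```python
# Mean completion times (seconds) reported for days 1, 30, and 45.
VISIT_MEANS = [51.8, 47.5, 45.1]
# Reported repeatability coefficient (seconds).
CR_SECONDS = 8.76


def visit_changes(means):
    """Improvement between consecutive visits (positive = faster)."""
    return [round(earlier - later, 1) for earlier, later in zip(means, means[1:])]


def total_change(means):
    """Improvement from the first to the last visit (positive = faster)."""
    return round(means[0] - means[-1], 1)


def within_repeatability(first, second, cr=CR_SECONDS):
    """True when the absolute difference between 2 tests does not exceed the CR."""
    return abs(second - first) <= cr


print("Changes between visits (s):", visit_changes(VISIT_MEANS))
print("Total change (s):", total_change(VISIT_MEANS))
print("Day 1 vs day 45 within CR:", within_repeatability(VISIT_MEANS[0], VISIT_MEANS[-1]))
```

The consecutive changes sum to the total change from day 1 to day 45, and that total is smaller than the CR, which illustrates why a difference of this size between 2 tests cannot be separated from ordinary test-retest variability.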
No difference in time to complete the test was demonstrated between sexes (girls: 52.4 ± 2.3 seconds, boys: 51.4 ± 1.3 seconds; P = .70). Improvement in time to complete the test was also similar in both sexes (visit-by-sex interaction P = .30), as boys and girls each showed significant improvements in times at each visit (Figure 1). Girls improved by 4.8 ± 0.9 seconds from test 1 to test 2 and by 7.2 ± 0.99 seconds overall from test 1 to test 3, while boys improved by 3.3 ± 0.8 seconds from test 1 to test 2 and 6.1 ± 0.8 seconds overall from test 1 to test 3.
Age can also affect the time required to complete the KD test. At baseline, adolescents 15 years and younger took longer to complete the test (54.6 ± 1.7 seconds) than adolescents 16 and older (48.8 ± 1.6 seconds; P = .01); however, in the mixed-model analysis, the change in time across visits appeared to be similar for both age groups (visit-by-age interaction P = .30) as adolescents ≤15 years old and ≥16 years old both demonstrated significant improvements in times over the 3 testing days (Figure 2). In younger participants, time to complete the test improved by 3.8 ± 0.7 seconds from visit 1 to visit 2 and by a total of 6.5 ± 0.7 seconds. In the older participants, time to complete the test improved by 4.8 ± 0.7 seconds from visit 1 to visit 2 and by a total of 6.9 ± 0.7 seconds.
Using multiple instruments to detect the array of functional changes occurring postconcussion, particularly as part of a sideline assessment, has become a supported practice.26,35,40,56 Evaluating changes or deficits in oculomotor function should be considered as part of the concussion-assessment process.35,40,47 The KD test is designed to screen changes in oculomotor function, namely saccadic rhythm, and has been introduced as a potential addition to the tests currently available to determine if an athlete has sustained an SC (ie, SCAT3/Child SCAT3 and Balance Error Scoring System). A key advantage of the KD test is its ease of delivery, both in the collection of baseline data and as a rapid assessment tool postinjury. With practice, health care providers and in certain cases laypersons (eg, parents and coaches) can learn how to deliver the test. One group57 examined the ability of laypersons (ie, the parents of a cohort of amateur boxers) to adequately deliver the KD test and reported that, with practice, laypersons as well as health care providers could effectively administer the test with strong test-retest reliability. Even though this study examined laypersons' ability to conduct the test after an athlete's potential head injury, caution should be exercised in allowing laypersons to interpret the findings. Perhaps the best use of a layperson's ability to collect KD test data is to assist youth sport organizations with baseline testing, which can supply valuable information to health care providers who conduct the postinjury KD test in their decision-making process. Although research has begun to emerge regarding the usefulness of the KD test in detecting oculomotor impairment postconcussion, the instrument has largely been investigated in an injured population and focused on cohorts of pre-adults or adults.
The purpose of our study was to examine the test-retest reliability of the KD test in a healthy adolescent cohort to assess the instrument's usefulness in an injured population.
Our ICC suggests that the KD test, when delivered similarly across 3 testing sessions (as per the standard test instructions), is a reliable measure; however, our participants' times (regardless of sex or age) improved significantly and resulted in a CR of 8.76 seconds over the 3 testing sessions. As stated earlier, this indicates that the results of repeated KD tests can be expected to fall within ±8.76 seconds for 95% of the population. Therefore, clinicians should be cautious, especially when using this test to rule out impairment. Our findings are in close agreement with those of early researchers who found a lack of strong test-retest reliability and an impressive learning curve in healthy children 12 years old and younger. Oride et al42 investigated the reliability of the KD test in 63 children aged 7−12 years old. The participants completed the KD test at a baseline session and then again 2 weeks later. They demonstrated a mean improvement (decrease in time) of 9.7 ± 11.9 seconds at the second testing session. The researchers concluded the KD test may be useful as a screening tool for determining reading deficiencies; however, they questioned its use as a diagnostic measure and did not consider it an appropriate tool to monitor the progress of patients undergoing oculomotor therapy. Our cohort demonstrated a mean decrease of 6.7 seconds between days 1 and 45. Although our sample did not demonstrate as much improvement, we allowed more time between test sessions, and our sample was older (12−18 years).
The KD test has reemerged as a potential sideline-assessment tool to aid in detecting SC. Several investigators have examined the performance of the KD test as a sideline-assessment tool, particularly in collegiate and professional athletes. Although no single research study has described test-retest reliability in an adolescent or adult population, several groups examined reliability as part of a larger work. More recent evidence31,37 suggests better test-retest reliability in an adult sample (>22 years old), thereby indicating that perhaps developmental patterns of saccadic rhythms become more constant with age and more specifically through the adolescent years. In 2011, Galetta et al37 investigated the use of the KD test as a rapid screening tool in a sample of 39 boxers and mixed martial arts fighters (mean age = 24 years [range, 16−53 years]). To establish the test-retest reliability of the KD test (as no other published test-retest reliability data existed for this age group), they had the fighters complete the KD test twice (in a healthy state) before undertaking a 9-minute sparring bout. The 2 prefight test sessions were administered 15 minutes apart. They noted slightly lower means for the second KD test (improvement); however, the ICC showed good test-retest reliability (ICC = 0.97, 95% confidence interval = 0.90, 1.0) between the testing sessions. In a second study, Galetta et al31 examined test-retest reliability in 219 collegiate student-athletes who completed tests at the beginning of their sport season (baseline) and during the postseason. Improvements in time were detected from baseline to postseason (37.9 seconds versus 35.1 seconds; P = .03 via the Wilcoxon signed-rank test), suggesting questionable test-retest reliability or mild learning effects (or both). The test-retest findings for the KD test are similar to our results, although adolescents appeared to improve more than their older counterparts. 
In addition to examining the test-retest reliability of the KD test in healthy participants, Galetta et al31,37 examined the use of the instrument as a screening tool postconcussion. In both cohorts (the fighters and collegiate athletes), the researchers noted a significant increase in time to complete the KD, as well as an increase in errors (worse performance), findings that were consistent with other markers of concussion used as criterion standards, regardless of the questionable test-retest results in a healthy population.
Ultimately, providers using the KD test for initial detection and serial reassessment of concussion in youth need to decide how learning effects in a healthy population influence the interpretation of postconcussion results. Most recently, Galetta et al38 included a cohort of youths (n = 243, aged 5−17 years old) in a study designed to examine the KD test as a complementary tool to the Standard Assessment of Concussion and balance testing in a sideline assessment for adolescents and collegiate athletes. In comparison with the adolescents in our cohort, the adolescents in their study completed the KD test at a preseason baseline assessment in a mean of 60.6 ± 22.3 seconds. A child who sustained a concussion was assessed using the KD test along with the Standard Assessment of Concussion and timed tandem gait. Among all individuals who incurred a concussion (n = 12; adolescents were not separated from collegiate athletes), time to complete the KD test worsened significantly. Concussed athletes took longer to complete the KD test by 5.2 seconds (range, –12.7 to 42.7 seconds) as compared with nonconcussed matched controls (improved 6.4 seconds). Our findings revealed a similar improvement in time to complete the KD test as displayed by the matched controls (6.7 seconds over 3 testing sessions). Use of the KD test postconcussion to reveal impairments after a concussion may be predicated on the fact that a child is expected to demonstrate significant improvement if he or she is healthy; therefore, slower times compared with baseline (>5−6 seconds), with or without the addition of errors, may be interpreted as reason to suspect a concussion or other head injury.
Adding to the strength of our study were design features such as the inclusion of interrater training sessions, diligence in notifying participants and scheduling retest sessions in advance, and ensuring a consistent delivery environment for the KD across locations. Before data collection, each tester underwent a 20-minute orientation on the KD test, which included (1) the importance of reading the instructions to the participants in a consistent manner, (2) learning how the KD is scored, and (3) practicing a minimum of 3 times on each other. Testers were able to ask questions to ensure they understood the delivery process. Test sessions were predetermined once the initial dates were selected. Students were informed about their next test session at the end of each session, and school administrators were given a list of students who were scheduled to retest a day in advance. This allowed us to test students at exactly 30 and 45 days, with a 92% retention rate of initial participants. Finally, some variation existed in the building locations where the testing took place; however, each space was quiet and offered proper lighting for the testing process. A supervisor monitored the testers to ensure proper administration of the KD test.
The major limitation of this study was the small number of students who chose to participate relative to the number that were recruited. Only 74 of 300 potential student-athletes who were invited chose to complete the pretest, and 68 participants completed all 3 testing sessions. This low recruitment rate was surprising, but we were able to achieve 81.5% power to detect differences of 4.3 seconds at α = .05 with a smaller than expected sample. Although our sample was either larger than or consistent with the samples of other research studies and a 92% retention rate is admirable, a larger sample would have provided stronger support for the overall findings as well as sex and age stratification.
Established learning effects demonstrated by improvements in time to complete the KD test in a healthy sample of participants (adolescents and adults) have been reported by other investigators.31,42 Our findings in a healthy adolescent population are consistent with these results. Improvements in time to complete the KD test in healthy youths may be attributed to being exposed to the same version 3 times within a 45-day period but may also be related to familiarization with instructions and testing procedures. At each testing session, participants were more familiar with not only the actual test but the instructions. As did previous researchers, we controlled for exposure to the KD test between testing sessions. Students were given no other restrictions for school or activities of daily living; they continued to participate in reading, other school-related activities, and activities of daily living and continued their saccadic rhythm patterns and other oculomotor functions. We did not train these functions, but we did not limit them either. There would be no reason to believe their patterns would have deteriorated between sessions. One recommendation for future study is to determine if delivering version 1 of the KD at baseline and version 2 at a second point in a healthy population improves test-retest reliability over time.
Clinicians using the KD test to screen for SC in adolescents should consider how potential learning effects influence test-retest reliability as demonstrated in our healthy adolescent sample. When delivering a test or tool to detect and identify an injury, condition, or illness, clinicians should feel confident that the results of the test or tool are due to the injury, condition, or illness and not to poor reliability of the test or tool itself. The greater the reliability of the instrument, the smaller the number of false-positive and false-negative results detected. In spite of the clinically acceptable test-retest reliability of the KD, the large CR makes it challenging to establish an appropriate level of least likely change postconcussion. A result that is slower than baseline may be helpful in the diagnosis or detection of a concussion, but a result that is at the baseline measurement or slightly faster may not be valuable in excluding a concussion diagnosis. Findings from the KD test may add information to the clinical decision-making process after a head injury; however, clinicians should continue to incorporate multiple measures to ensure accurate detection of SCs.