Anxiety and depressive disorders affect 1 in 3 people in their lifetime and are commonly comorbid with a range of psychiatric and medical conditions.[1–3] Despite their prevalence, these disorders are undertreated in the United States, particularly in rural areas where behavioral healthcare shortages are common.[4] Two well-validated groups of clinical inventories—comprising Generalized Anxiety Disorder (GAD) and Patient Health Questionnaire (PHQ) assessments—are commonly used in ambulatory settings to screen patients for anxiety and depression, respectively. Despite their utility, continued reliance on in-clinic administration limits the reach and cadence of these assessments and may preclude timely population- and individual-level insights. Emerging digital behavioral health (dBH) technologies that enable mobile and web-based remote GAD and PHQ administration promise to offer such insights by increasing assessment frequency and promoting compliance. These technologies can provide a foundation to increase clinical efficiency, support near real-time risk stratification, and enhance clinical decision support. Here, we compared scores from assessments given in the clinic with those given via a dBH platform in a retrospective study of patients from a private primary healthcare system. In-clinic assessments were collected between November 2018 and April 2024; dBH-administered assessments were collected between November 2021 and April 2024.

All dBH-administered GADs and PHQs were self-assessed by patients using NeuroFlow, a Health Insurance Portability and Accountability Act (HIPAA)-compliant mobile- and desktop-accessible platform that allows remote completion of screeners, symptom tracking, and engagement in both prescribed and self-directed behavioral healthcare activities.[5,6] Providers had access to a dashboard within the platform to enable real-time clinical decision support and longitudinal monitoring of patient well-being throughout the study. Only patients who completed both in-clinic and dBH-administered assessments of the same type were included in our analyses. To account for mixed assessment types (e.g., PHQ-9, PHQ-2/9), we collapsed all assessments of the same type into either “GAD” or “PHQ” bins. Our study was conducted in accordance with the principles of the Declaration of Helsinki; Advarra’s Center for Institutional Review Board Intelligence exempted it from oversight and waived the requirement for informed consent.

Our hypothesis was twofold. First, we hypothesized that compared to in-clinic assessments, dBH-administered assessments would allow for early identification of rising risk. Rising risk refers to a significant increase in a patient’s symptom severity over time, which we conservatively defined as an increase greater than or equal to 5 points in assessment score coupled with a categorical worsening of symptom severity (e.g., from “mild” to “moderate”). For example, if a patient had a clinic-administered GAD score of 5, followed by a dBH-administered GAD score of 10, this would constitute a rising risk case, as the score increased by 5 points and the severity transitioned from “mild” to “moderate.”
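The rising risk rule described above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the study's implementation: the function names are hypothetical, and the severity cut points (5, 10, 15) are assumed from the conventional GAD-7 bands, since the text names only the "mild" and "moderate" categories.

```python
from bisect import bisect_right

# Assumed severity cut points (conventional GAD-7 bands); the article does not
# list the full set of bands it used.
CUTOFFS = [5, 10, 15]
LABELS = ["minimal", "mild", "moderate", "severe"]


def severity_band(score: int) -> int:
    """Return the index of the severity band that a score falls into."""
    return bisect_right(CUTOFFS, score)


def severity_label(score: int) -> str:
    """Return the human-readable severity label for a score."""
    return LABELS[severity_band(score)]


def is_rising_risk(prior_score: int, later_score: int, min_increase: int = 5) -> bool:
    """Apply both criteria: a >= 5-point increase AND a worse severity band."""
    increased_enough = later_score - prior_score >= min_increase
    band_worsened = severity_band(later_score) > severity_band(prior_score)
    return increased_enough and band_worsened


if __name__ == "__main__":
    # The article's own example: an in-clinic 5 followed by a dBH 10.
    print(severity_label(5), "->", severity_label(10), is_rising_risk(5, 10))
```

Under these assumed bands, a move from 5 to 10 qualifies, whereas a move from 15 to 21 does not, because the score rises by more than 5 points but stays within the same band.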

In addition, we hypothesized that dBH-administered assessments would uncover unique cases of hidden risk. We defined hidden risk using the same criteria as rising risk (i.e., a ≥ 5-point increase in score along with a categorical worsening of severity), but with one key difference: in hidden risk cases, the dBH-administered assessment is the only indication of elevated symptom severity. For instance, if a patient had an in-clinic GAD score of 5, followed by a dBH GAD score of 10, and then returned to an in-clinic GAD score of 5, this would constitute a hidden risk case, as the severity increase was only captured in the dBH assessment. It is important to note that per our operational definitions, all hidden risk cases are first classified as rising risk but not all rising risk cases become hidden risk. This is because rising risk may later be confirmed in a clinical setting, whereas hidden risk remains identified solely through the dBH-administered assessments.
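The in-clinic, dBH, in-clinic pattern described above might be checked as follows. This is a minimal sketch under stated assumptions rather than the authors' method: the `Assessment` record and function names are hypothetical, the cut points are assumed from conventional GAD-7 bands, only strictly consecutive triplets are examined, and "returned to baseline" is interpreted as the follow-up score falling in a band no higher than the baseline band.

```python
from bisect import bisect_right
from typing import List, NamedTuple, Tuple

# Assumed severity cut points (conventional GAD-7 bands), not taken from the article.
CUTOFFS = [5, 10, 15]


class Assessment(NamedTuple):
    day: int      # days since the patient's first assessment (hypothetical field)
    source: str   # "clinic" or "dbh"
    score: int


def band(score: int) -> int:
    """Index of the severity band for a score."""
    return bisect_right(CUTOFFS, score)


def is_rising_risk(prior: int, later: int, min_increase: int = 5) -> bool:
    """A >= 5-point increase coupled with a worse severity band."""
    return later - prior >= min_increase and band(later) > band(prior)


def find_hidden_risk(
    history: List[Assessment],
) -> List[Tuple[Assessment, Assessment, Assessment]]:
    """Return clinic/dBH/clinic triplets in which only the dBH score shows risk."""
    ordered = sorted(history, key=lambda a: a.day)
    cases = []
    for first, middle, last in zip(ordered, ordered[1:], ordered[2:]):
        if (first.source, middle.source, last.source) != ("clinic", "dbh", "clinic"):
            continue
        # Assumption: "returned to baseline" means the follow-up band is no
        # higher than the baseline band.
        if is_rising_risk(first.score, middle.score) and band(last.score) <= band(first.score):
            cases.append((first, middle, last))
    return cases
```

With the article's example (in-clinic 5, dBH 10, in-clinic 5), the triplet is returned as a hidden risk case; if the second in-clinic score were 12 instead, the case would remain rising risk only, because the clinic visit confirmed the elevation.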

A total of 2797 (mean [SD] score: 4.89 [5.04]) GAD and 3476 (mean [SD] score: 6.51 [5.67]) PHQ assessments were collected from 587 patients (212 women, 142 men, 233 others; mean [SD] age: 50.8 [15.9] y). In-clinic assessments comprised 1129 GADs (mean [SD] score: 5.70 [5.19]) and 1604 PHQs (mean [SD] score: 7.48 [5.95]). The dBH-administered assessments included 1668 GADs (mean [SD] score: 4.34 [4.86]) and 1872 PHQs (mean [SD] score: 5.68 [5.28]). Of note, mean scores for in-clinic GAD and PHQ assessments were significantly higher than those for dBH-administered assessments (p < 0.001 for both).

We used time-series analysis with Boolean thresholding operations to examine each patient’s assessment scores across the study period. We found 78 total rising risk cases in 76 of our 587 participants (12.9%). In 42 of those 78 cases, the patient’s assessment score transitioned from nonclinical (i.e., <10), indicating “minimal” or “mild” symptom severity, to clinical (i.e., ≥10), indicating “moderate” or worse symptom severity. Among the 66 rising risk patients who went on to complete an in-clinic assessment by the time of our study, the quartiles of the lead time afforded by dBH-administered assessments ahead of the subsequent in-clinic assessment were 2, 9, and 86 days, underscoring the potential of remote assessments to provide actionable patient insights between visits.
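One way the lead-time summary could be computed is sketched below. The helper names are hypothetical, and the quantile method is an assumption, since the article does not state which one was used; the sketch only shows the shape of the calculation and makes no attempt to reproduce the reported values.

```python
from datetime import date
from statistics import quantiles
from typing import List, Optional


def lead_time_days(detected_on: date, clinic_dates: List[date]) -> Optional[int]:
    """Days from a dBH rising-risk detection to the next in-clinic assessment.

    Returns None when the patient has no later in-clinic assessment, which
    mirrors excluding patients who had not yet returned to the clinic.
    """
    later_visits = [d for d in clinic_dates if d > detected_on]
    if not later_visits:
        return None
    return (min(later_visits) - detected_on).days


def lead_time_quartiles(lead_times: List[int]) -> List[float]:
    """First quartile, median, and third quartile of the lead times.

    The "inclusive" method is an assumption made for this sketch.
    """
    return quantiles(lead_times, n=4, method="inclusive")
```

For example, a detection on January 1 followed by in-clinic visits on January 10 and March 1 yields a lead time of 9 days, because only the next visit counts.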

Next, we searched for instances of hidden risk by further examining the 78 rising risk cases. Recalling the structure of our hidden risk analysis (i.e., an in-clinic assessment, followed by a dBH-administered assessment, followed by another in-clinic assessment), we treated the first in-clinic assessment in each instance as the patient’s baseline symptom severity, and examined the second in-clinic assessment in the context of that baseline. In 12 of the 78 rising risk cases, the patient’s second in-clinic assessment had returned to baseline without behavioral intervention—demonstrating that an exclusively in-clinic approach to assessment administration is likely to miss actionable patient insights.

In a final ad hoc analysis, we compared exclusively in-clinic assessments with those administered via the dBH platform to evaluate noninferiority. To do this, we divided all assessments administered between November 2021 and April 2024 by their source (i.e., clinic- or dBH-administered) and computed Kaplan-Meier survival curves,[7] treating the detection of rising risk, as defined previously, as the “event” in a survival analysis context (Fig. 1). A log-rank analysis revealed no significant difference between the curves (χ2 = 0.67, p = 0.43), suggesting that in-clinic and dBH-administered approaches are comparable in their ability to surface risk over time. This finding provides preliminary evidence that a combined in-clinic/dBH approach to assessment administration may be particularly effective in surfacing risk, although prospective longitudinal studies will be needed to test this hypothesis.
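For readers unfamiliar with the estimator, the Kaplan-Meier survival function can be sketched in plain Python as below. This is a generic illustration rather than the study's analysis code: the function name is hypothetical, durations are assumed to be days from the start of observation to either the detection of rising risk (the event) or the end of follow-up (censoring), and the log-rank comparison between curves is not included.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def kaplan_meier(
    durations: Sequence[float], observed: Sequence[bool]
) -> List[Tuple[float, float]]:
    """Return (time, survival probability) pairs at each event time.

    durations: time to event or censoring for each subject.
    observed:  True if the event (here, detected rising risk) occurred,
               False if the subject was censored at that time.
    """
    events = Counter(t for t, seen in zip(durations, observed) if seen)
    exits = Counter(durations)  # everyone leaving the risk set at time t
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = events.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        # Remove both events and censored subjects from the risk set.
        at_risk -= exits[t]
    return curve
```

In this framing, the curve gives the estimated probability that rising risk remains undetected up to each time point; computing one curve per administration source yields the two curves that a log-rank test would then compare.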

Figure 1

Kaplan-Meier survival curves for exclusively in-clinic and dBH-administered GAD and PHQ assessments between November 2021 and April 2024. These curves depict the probability of undetected rising risk over time for each assessment type, comparing the two contexts of administration. We found no significant difference between in-clinic and dBH-administered assessments in their ability to surface risk (log-rank test: χ2 = 0.67, p = 0.43). The shaded areas represent 95% CIs for each curve. These results suggest that both assessment approaches are similarly effective at identifying rising risk, although future studies with larger samples are warranted to confirm this observation.

Our findings attest to the utility of remote assessments in yielding actionable population-health insights and suggest that approaches such as annual screenings,[8,9] although unquestionably a step forward, nevertheless remain insufficient to address patients’ evolving behavioral health challenges. Understanding the prevalence of psychiatric disorders and their population trends can inform strategic programs aimed at efficiently meeting patients’ needs. Healthcare systems and payer networks can implement dBH screenings to augment existing approaches and gain more robust insights into these trends. At the individual patient level, remote assessments for anxiety and depression provide an opportunity for early intervention in rising risk situations, as well as improved clinical decision support and individualized care. We found that in-clinic GAD and PHQ scores were significantly higher on average than dBH-administered scores; nevertheless, dBH assessments were just as effective in detecting rising risk over time (Fig. 1). This suggests that dBH assessments can serve as a valuable tool for identifying risk remotely, particularly in settings where in-clinic assessments are not feasible or frequent. Remote assessments also give patients a means to be more forthcoming about their emotional states in a private setting, bolstering the ability of such assessments to surface hidden risk (see Miller et al.[10] and Gratch et al.[11] for evidence of patients underreporting distress on clinic-administered screenings). Importantly, dBH technologies are geographically agnostic and can meet patients of all socioeconomic strata “where they are”; thus, they may be especially well-suited to extend health equity in America’s mental health deserts where psychiatric professionals are scarce,[4] and across socially vulnerable and historically marginalized communities where risk of suicide is significantly elevated.[12]

1. Vos T, Abajobir AA, Abate KH, et al. Global, regional, and national incidence, prevalence, and years lived with disability for 328 diseases and injuries for 195 countries, 1990–2016: a systematic analysis for the Global Burden of Disease Study 2016. Lancet. 2017;390:1211–1259.
2. Merikangas KR, Swanson SA. Comorbidity in anxiety disorders. Curr Top Behav Neurosci. 2010;2:37–59.
3. Kang HJ, Kim SY, Bae KY, et al. Comorbidity of depression with physical disorders: research and clinical implications. Chonnam Med J. 2015;51:8–18.
4. Livingston K, Green M. America’s mental health care deserts: where is it hard to access care? [Internet]. ABC News. 2022. Accessed Apr 11, 2024. https://abcnews.go.com/Health/americas-mental-health-care-deserts-hard-access-care/story?id=84301748
5. Holley D, Lubkin S, Brooks A, et al. Digital CBT interventions predict robust improvements in anxiety and depression symptoms: a retrospective database study. IDDB. 2024;4:53–55.
6. Hartz M, Hickey D, Acosta L, et al. Using natural language processing to detect suicidal ideation and prompt urgent interventions: a retrospective database study. IDDB. 2024;4:6–8.
7. Goel MK, Khanna P, Kishore J. Understanding survival analysis: Kaplan-Meier estimate. Int J Ayurveda Res. 2010;1:274–278.
8. Blackstone SR, Sebring AN, Allen C, et al. Improving depression screening in primary care: a quality improvement initiative. J Community Health. 2022;47:400–407.
9. Garcia ME, Hinton L, Neuhaus J, et al. Equitability of depression screening after implementation of general adult screening in primary care. JAMA Netw Open. 2022;5:e2227658.
10. Miller DP Jr, Foley KL, Bundy R, et al. Universal screening in primary care practices by self-administered tablet vs nursing staff. JAMA Netw Open. 2022;5:e221480.
11. Gratch I, Choo TH, Galfalvy H, et al. Detecting suicidal thoughts: the power of ecological momentary assessment. Depress Anxiety. 2021;38:8–16.
12. Liu S, Morin SB, Bourand NM, et al. Social vulnerability and risk of suicide in US adults, 2016-2020. JAMA Netw Open. 2023;6:e239995.

Competing Interests

Source of Support: This study was funded by NeuroFlow, Inc. The data described herein were collected and analyzed as a facet of routine healthcare operations undertaken by NeuroFlow and The Villages Health.

Conflict of Interest: Dan Holley, Amanda Brooks, and Tom Zaubler are employees of NeuroFlow, Inc., whose product, the NeuroFlow digital behavioral health platform, was used in this study. Sheila Thomas and Robert Reilly are employees of The Villages Health, a NeuroFlow client organization and research partner.

This work is published under a CC-BY-NC-ND 4.0 International License.