Abstract

Objective:

In this secondary analysis of data from the Veterans Affairs Augmentation and Switching Treatments for Improving Depression Outcomes (VAST-D) study, the authors sought to determine the effectiveness of early improvement (or lack thereof) for predicting remission from depression with antidepressant therapy.

Methods:

This study used data from the VAST-D study, a multisite, randomized, single-blind trial with parallel assignment to one of three medication interventions for 1,522 veterans whose major depressive disorder was unresponsive to at least one course of antidepressant treatment meeting minimal standards for dosage and duration. The authors calculated the positive predictive value (PPV) and negative predictive value (NPV) of early improvement on remission, response, or greater than minimal improvement from depression for various degrees of improvement (10%–50%) on the Quick Inventory of Depressive Symptomatology–Clinician Rated (QIDS-C) at 1, 2, 4, and 6 weeks.

Results:

The end of week 2 of treatment was identified as the best time to evaluate early improvement. The presence of a ≥20% drop from the baseline QIDS-C score by the end of week 2 resulted in a PPV for remission of 38% and an NPV of 93% by week 12. Extending the observational window to week 6 minimally improved NPV (97%). This association did not differ across treatment groups.

Conclusions:

A lack of early improvement at the end of week 2 of antidepressant therapy can be used to inform clinical decisions on the likelihood of nonremission of depression during the subsequent 10 weeks, even when dosage optimization is incomplete.

HIGHLIGHTS

  • The optimal time for evaluating early improvement from an antidepressant medication regimen is at the end of week 2.

  • A lack of early improvement at the end of week 2 of antidepressant therapy can be used to inform clinical decisions on the likelihood of nonremission of depression with that therapy during the subsequent 10 weeks, even when dosage optimization is incomplete.

  • The same factors that influence early improvement also determine whether a patient will show a false negative outcome (i.e., achieve remission by the end of week 12 despite no early improvement): lower baseline Quick Inventory of Depressive Symptomatology–Clinician Rated score, fewer adverse childhood experiences, lower baseline anxiety, lower suicidal ideation, and higher baseline quality of life score.

  • The utility of using lack of early improvement to predict lack of remission in antidepressant therapy did not depend on treatment allocation.

Major depressive disorder is a significant health concern not only because it is one of the most prevalent psychiatric disorders (1) but also because it accounts for the greatest number of disability-adjusted life years among psychiatric disorders (2). Proper management is therefore critical. There is consensus about which drugs to choose at the initiation of antidepressant medication therapy (e.g., a selective serotonin reuptake inhibitor [SSRI]) and what the target dosages of these drugs should be (3). It is also generally agreed that the optimal endpoint should be remission of symptoms and that it is prudent for clinicians to adjust the medication therapy until remission is achieved (4–6). However, when antidepressant medication therapy does not result in the expected improvement, the decision-making process becomes complicated. For example, if remission is not achieved, should the clinician accept a lower level of improvement? Also, when should the first decision point occur? Knowing when to alter the medication treatment and knowing the probability of achieving greater improvement at each decision point could save weeks to months of unnecessary suffering and minimize the adverse consequences of ineffectively treated depression.

In the management of depression in patients who do not adequately respond to initial therapy, it is critical to determine when a patient will need to proceed to a next-step medication. In a meta-analysis covering 17 studies and 14,779 patients, the role of early improvement (i.e., a ≥20% drop from baseline depression severity score on either the Hamilton Depression Rating Scale or the Montgomery-Asberg Depression Rating Scale at the end of 2 weeks of medication therapy) was assessed (7). Nearly two-thirds (63%) of patients treated with an antidepressant showed early improvement, whereas only 47% of patients treated with placebo did. The use of early improvement accurately predicted those patients who would ultimately achieve remission by 8–12 weeks in 42% of the patients (positive predictive value [PPV]); more importantly, the absence of a ≥20% early improvement predicted the lack of ultimate remission for 90% of the patients (negative predictive value [NPV]). Early improvers were 8.4 times more likely to be identified as later responders to the medication and 6.4 times more likely to achieve remission than patients who showed no early improvement. Other meta-analyses, evaluating data on fewer participants, have also provided evidence supporting early improvement as a predictor of ultimate remission (8–11). In these studies, lack of early improvement has been the most reliable predictor of nonremission. In addition, in one meta-analysis, a slightly higher NPV (94%) was noted when the early improvement observation period was extended to 4 weeks (10).

The Veterans Affairs Augmentation and Switching Treatments for Improving Depression Outcomes (VAST-D) study is the largest next-step trial for individuals who did not adequately respond to an initial antidepressant (4, 12, 13). Our goal in this secondary analysis of the VAST-D data was to explore the effectiveness of using early improvement (i.e., a drop from the baseline depression severity score as measured by the Quick Inventory of Depressive Symptomatology–Clinician Rated [QIDS-C] within the first few weeks of antidepressant treatment) to predict remission, response, or greater than minimal improvement during the acute phase of the trial (the first 12 weeks of treatment).

Methods

Compliance

The U.S. Department of Veterans Affairs (VA) Office of Research and Development and the Central Institutional Review Board (CIRB) approved the VAST-D study. A certificate of confidentiality was obtained for the study from the National Institutes of Health. The CIRB conducted annual continuing reviews, and a data monitoring committee (DMC) reviewed the study biannually. Adverse events were reviewed by both the CIRB and DMC throughout the study. All participants provided written informed consent and privacy authorization after receiving full explanation of the study procedures.

Study Design

VAST-D was a multisite (see the online supplement for a list of participating sites), randomized, single-blind, parallel-assignment next-step trial of veterans whose major depressive disorder was suboptimally responsive to at least one course of antidepressant treatment with an SSRI, serotonin and norepinephrine reuptake inhibitor, or mirtazapine that met or exceeded minimal standards for dosage and duration of treatment. Suboptimal response was defined as a score of ≥16 (indicating severe depression) on the QIDS-C questionnaire after at least 6 weeks of treatment or a score of ≥11 (indicating moderate depression) after at least 8 weeks of treatment, with the three most recent weeks at a stable, “optimal” dosage (4, 12, 13).

A full description of the overall design of the VAST-D study (including the Consolidated Standards of Reporting Trials [CONSORT] statement and flow diagram) has been published previously (4, 12, 13).

Participants

Participants were 1,522 Veterans Health Administration (VHA) patients, 18 years or older and diagnosed as having major depressive disorder, who were referred by their VHA clinicians. Study clinicians confirmed the diagnosis prior to study enrollment. Research staff further established diagnostic eligibility using criteria from the DSM-IV-TR. Potential participants who were pregnant; breastfeeding; currently using contraindicated medications, including either study drug; or had a clear history of nonresponse or intolerance to bupropion-SR or aripiprazole, were excluded from the study. Participants who had a primary diagnosis of bipolar, psychotic, obsessive-compulsive, dementia, or eating disorders; had general medical conditions contraindicating the use of bupropion-SR or aripiprazole; had serious, unstable medical conditions requiring acute treatment; met criteria for substance dependence requiring inpatient detoxification; or were considered at high risk for suicide and in need of acute treatment were also excluded.

Interventions

This report addresses the acute phase (first 12 weeks of treatment) of the VAST-D study, in which 1,522 veterans with nonpsychotic major depressive disorder were randomized to one of three treatment groups: augmentation with bupropion-SR (Aug-BUP), augmentation with aripiprazole (Aug-ARI), or switch to another antidepressant (i.e., bupropion-SR [Switch-BUP]) (4, 12, 13). For the treatment groups receiving them, the dosage of index antidepressants remained relatively constant throughout the trial. Treatments included titration (cross-titration for the Switch-BUP group) from standard starting dosages of 150 mg bupropion-SR with titration up to 400 mg daily or 2 mg aripiprazole with titration up to 15 mg daily, until depressive symptoms remitted or side effects were intolerable. Dosage adjustments were guided by participant responses on the Patient Health Questionnaire (14) and a Frequency, Intensity, and Burden of Side Effects Rating (15) obtained at each visit. Treatment visits occurred at baseline and at the end of weeks 1, 2, 4, 6, 8, 10, and 12.

Baseline Assessments

Baseline measures in this analysis included age, marital status, education, employment status, race-ethnicity, number of lifetime episodes of major depressive disorder, duration of the current episode, number of past medication trials with antidepressants, presence of a substance or alcohol abuse diagnosis (Mini-International Neuropsychiatric Interview score) (16), severity of childhood adverse experiences (Adverse Childhood Experiences Survey score) (17), severity of grief (Complicated Grief Questionnaire score) (18), severity of suicidal ideation (Columbia-Suicide Severity Rating Scale [C-SSRS] score) (19), severity of anxiety (Beck Anxiety Inventory score) (20), presence of mixed features as measured by a self-rated 9-item mixed features scale based on the DSM-5, severity of health impairment as measured by the Cumulative Illness Rating Scale (CIRS) (21), general life satisfaction as measured by the Quality of Life Enjoyment and Satisfaction Questionnaire–Short Form (Q-LES-Q-SF) (22), QIDS-C score (23), and duration of the index treatment trial (in months).

Primary Outcome Measure

The primary outcome measure, the QIDS-C score, was collected by an independent evaluator who was blind to the patients’ treatment assignments at baseline and at each visit following randomization. Standard definitions of “response” (≥50% decrease from baseline QIDS-C score at the end of week 12), and “remission” (QIDS-C scores ≤5 on two consecutive evaluations anytime during the 12-week acute phase) were used. In addition, “greater than minimal improvement” was defined as a >30% decrease from baseline QIDS-C score at the end of week 12. Except in exploratory analyses, early improvement was defined as a ≥20% drop from baseline QIDS-C score by the end of week 2.
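The outcome definitions above can be written down as a few small predicates. The following is a minimal sketch, not the study's analysis code: the function names are hypothetical, the week 2 score is assumed to stand in for "by the end of week 2," and the remission check assumes it is given only the observed post-baseline QIDS-C scores in visit order.

```python
from typing import Sequence

REMISSION_CUTOFF = 5  # QIDS-C scores at or below this value count toward remission


def percent_drop(baseline: float, score: float) -> float:
    """Percentage decrease from the baseline QIDS-C score (baseline assumed > 0)."""
    return 100.0 * (baseline - score) / baseline


def early_improvement(baseline: float, week2: float, threshold: float = 20.0) -> bool:
    """Drop of at least `threshold` percent from baseline at the week 2 visit."""
    return percent_drop(baseline, week2) >= threshold


def response(baseline: float, week12: float) -> bool:
    """Decrease of at least 50% from baseline at the end of week 12."""
    return percent_drop(baseline, week12) >= 50.0


def greater_than_minimal_improvement(baseline: float, week12: float) -> bool:
    """Decrease of more than 30% from baseline at the end of week 12."""
    return percent_drop(baseline, week12) > 30.0


def remission(observed_scores: Sequence[float]) -> bool:
    """Scores <= 5 on two consecutive evaluations during the acute phase."""
    previous_low = False
    for score in observed_scores:
        low = score <= REMISSION_CUTOFF
        if low and previous_low:
            return True
        previous_low = low
    return False
```

Note the asymmetry in the definitions: response and early improvement use an inclusive threshold (≥), whereas greater than minimal improvement uses a strict one (>), so a drop of exactly 30% does not qualify.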

Statistical Analysis

We conducted the statistical analysis by using observed cases. We calculated the PPV and NPV of early improvement on remission. To calculate PPV and NPV, we categorized participant outcomes as true positive (TP), false positive (FP), true negative (TN), and false negative (FN). A TP outcome was defined as having a ≥20% drop from baseline QIDS-C score by the end of week 2 (early improvement) and achieving remission by the end of week 12. An FP outcome was defined as showing early improvement but not achieving remission by the end of week 12. A TN outcome was one in which the participant did not demonstrate early improvement and did not achieve remission by the end of week 12. An FN outcome was one in which the participant did not show early improvement but achieved remission by the end of week 12. PPV and NPV were calculated as PPV=TP/(TP+FP) and NPV=TN/(TN+FN). We calculated sensitivity as the ratio of true positive outcomes to the total number of patients achieving remission (sensitivity=TP/[TP+FN]) and specificity as the ratio of true negative outcomes to the total number of patients not achieving remission (specificity=TN/[TN+FP]). The relative likelihood of remission, response, and greater than minimal improvement between those displaying early improvement and those who did not was calculated as unadjusted odds ratios from 2×2 frequency tables.
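As an illustration of these formulas, the sketch below applies them to the remission counts reported in Table 1 (TP=359, FP=581, FN=36, TN=450). The class and method names are hypothetical and chosen only for this example; the arithmetic follows the definitions given above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts from a 2x2 table of early improvement (test) by outcome."""
    tp: int  # early improvement, outcome achieved
    fp: int  # early improvement, outcome not achieved
    fn: int  # no early improvement, outcome achieved
    tn: int  # no early improvement, outcome not achieved

    def ppv(self) -> float:
        return self.tp / (self.tp + self.fp)

    def npv(self) -> float:
        return self.tn / (self.tn + self.fn)

    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    def odds_ratio(self) -> float:
        # Unadjusted odds ratio: (TP x TN) / (FP x FN)
        return (self.tp * self.tn) / (self.fp * self.fn)


# Remission counts as reported in Table 1
remission_counts = TwoByTwo(tp=359, fp=581, fn=36, tn=450)

for name in ("ppv", "npv", "sensitivity", "specificity"):
    print(f"{name}: {100 * getattr(remission_counts, name)():.1f}%")
print(f"odds ratio: {remission_counts.odds_ratio():.1f}")
```

Run as written, this reproduces the reported values to one decimal place: PPV 38.2%, NPV 92.6%, sensitivity 90.9%, specificity 43.6%, and an odds ratio of 7.7.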

To identify the optimal drop in baseline QIDS-C score and the observational window to achieve the best PPV and NPV values, we calculated the PPVs and NPVs for multiple percentage drops (10%, 20%, 30%, 40%, and 50%) and at various observational windows (weeks 1, 2, 4, and 6).
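The threshold-by-window search can be sketched as a simple grid. This is an illustrative outline under stated assumptions, not the authors' code: the record layout (a `baseline` score, a `scores` mapping from week to QIDS-C score, and a `remitted` flag) is hypothetical, and participants without a score at a given week are skipped, in keeping with an observed-cases analysis.

```python
from itertools import product

THRESHOLDS = (10, 20, 30, 40, 50)  # percentage drops from baseline
WINDOWS = (1, 2, 4, 6)             # observation windows, in weeks


def predictive_values(records, threshold, week):
    """Return (PPV, NPV) for one threshold and window; None if undefined."""
    tp = fp = fn = tn = 0
    for rec in records:
        score = rec["scores"].get(week)
        if score is None:  # observed cases only
            continue
        drop = 100.0 * (rec["baseline"] - score) / rec["baseline"]
        improved = drop >= threshold
        if improved and rec["remitted"]:
            tp += 1
        elif improved:
            fp += 1
        elif rec["remitted"]:
            fn += 1
        else:
            tn += 1
    ppv = tp / (tp + fp) if (tp + fp) else None
    npv = tn / (tn + fn) if (tn + fn) else None
    return ppv, npv


def predictive_value_grid(records):
    """Map every (threshold, week) pair to its (PPV, NPV)."""
    return {
        (threshold, week): predictive_values(records, threshold, week)
        for threshold, week in product(THRESHOLDS, WINDOWS)
    }
```

The grid has 20 cells (5 thresholds by 4 windows); comparing PPV and NPV across cells is what allows one combination to be singled out as the working definition of early improvement.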

We identified baseline characteristics associated with early improvers and participants exhibiting false negative outcomes by using chi-square tests for categorical variables and Wilcoxon rank sum tests for continuous variables. We calculated effect sizes (Cohen’s d) as the difference of the means divided by the pooled standard deviation. We conducted a chi-square analysis to compare withdrawal rates between early improvers and those who did not have early improvement. We used chi-square analysis to perform area-under-the-curve comparisons of receiver operating characteristic curves to determine the generalizability of using early improvement to predict remission.
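The effect size calculation can be sketched from summary statistics alone. The text does not specify how the standard deviations were pooled, so the sample-size-weighted form used here is an assumption, and the function names are hypothetical. Values recomputed from the rounded means and standard deviations in the tables may differ slightly from the reported effect sizes.

```python
import math


def pooled_sd(sd1: float, n1: int, sd2: float, n2: int) -> float:
    """Sample-size-weighted pooled standard deviation (assumed pooling rule)."""
    return math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))


def cohens_d(mean1: float, sd1: float, n1: int,
             mean2: float, sd2: float, n2: int) -> float:
    """Difference of the means divided by the pooled standard deviation."""
    return (mean1 - mean2) / pooled_sd(sd1, n1, sd2, n2)
```

For example, `abs(cohens_d(mean_a, sd_a, n_a, mean_b, sd_b, n_b))` gives the magnitude of the standardized difference between two groups regardless of which group is listed first.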

Results

Sixty-two percent of the sample showed a ≥20% drop from the baseline QIDS-C score by the end of week 2 (early improvement). Table 1 shows that early improvement resulted in a PPV for remission of 38% and an NPV for remission of 93%. The odds of achieving remission, response, and greater than minimal improvement were higher among individuals who exhibited early improvement (odds ratio [OR]=7.7, 95% confidence interval [CI]=5.4–11.1; OR=3.5, 95% CI=2.7–4.6; and OR=3.6, 95% CI=2.7–4.9, respectively). The corresponding sensitivity and specificity for remission were 91% (95% CI=87.6–93.5) and 44% (95% CI=40.6–46.7), respectively.

TABLE 1. Calculation of the positive predictive value (PPV) and negative predictive value (NPV) for a ≥20% drop in QIDS-C score from baseline to the end of week 2 (early improvement) among 1,522 veterans with depressionᵃ

| Patient status at end of week 12 | Early improvement: TPᵇ (N) | Early improvement: FPᶜ (N) | PPV (%) | 95% CI | No early improvement: FNᵈ (N) | No early improvement: TNᵉ (N) | NPV (%) | 95% CI | ORᶠ | 95% CI |
|---|---|---|---|---|---|---|---|---|---|---|
| Remissionᵍ | 359 | 581 | 38.2 | 35.0–41.4 | 36 | 450 | 92.6 | 90.3–94.9 | 7.7 | 5.4–11.1 |
| Responseʰ | 527 | 242 | 68.5 | 66.5–70.5 | 131 | 212 | 61.8 | 57.4–66.0 | 3.5 | 2.7–4.6 |
| GTMIⁱ | 658 | 111 | 85.6 | 83.7–87.2 | 213 | 130 | 37.9 | 34.1–41.8 | 3.6 | 2.7–4.9 |

aQIDS-C, Quick Inventory of Depressive Symptomatology–Clinician Rated. Possible scores range from 0 to 27, with higher scores indicating greater severity of depression.

bTP, true positives (participants who exhibited a ≥20% drop in QIDS-C score by the end of week 2 [early improvement] and who achieved remission by the end of week 12).

cFP, false positives (participants who showed early improvement but did not achieve remission by the end of week 12).

dFN, false negatives (participants who did not show early improvement but achieved remission by the end of week 12).

eTN, true negatives (participants who did not demonstrate early improvement and did not achieve remission by the end of week 12).

fOR, odds ratio=[(true positives)×(true negatives)]/[(false positives)×(false negatives)].

gRemission was defined as QIDS-C scores ≤5 on two consecutive evaluations anytime during the 12-week acute phase (analysis included all participants with a week 2 assessment [N=1,426]).

hResponse was defined as a ≥50% drop from baseline QIDS-C score at the end of week 12 (analysis included only participants with a week 2 assessment who completed follow-up to week 12 [N=1,112]).

iGTMI, greater than minimal improvement, was defined as a >30% drop from baseline QIDS-C score at the end of week 12 (analysis included only participants with a week 2 assessment who completed follow-up to week 12 [N=1,112]).

Early improvers were more likely to have been allocated to receive Aug-ARI and, at baseline, to have a greater number of lifetime episodes of depression, less severe suicidal ideation, less anxiety, and higher quality of life (Table 2), although the effect sizes for these associations were small (Cohen’s d=0.12–0.25). The highest level of education attained, marital status, employment status, presence of substance abuse, severity of grief, baseline QIDS-C score, age at enrollment, number of lifetime antidepressant trials, severity of childhood adverse experiences, presence of mixed features as measured by a self-rated 9-item mixed features scale based on the DSM-5, severity of health impairment (as measured by the CIRS), and duration of index treatment trial did not influence whether early improvement was present. Patients who did not have early improvement but achieved remission during the trial (i.e., had a false negative outcome) were more likely to have a lower baseline QIDS-C score, fewer adverse childhood experiences, a lower baseline Beck Anxiety Inventory score, a lower C-SSRS score, and a higher baseline quality of life (Q-LES-Q-SF) score (Table 3).

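TABLE 2. Characteristics of early improvers and early nonimprovers among 1,522 veterans with depressionᵃ

| Characteristic | Early improvers (N=940): N | Early improvers: % | Early nonimprovers (N=486): N | Early nonimprovers: % | p | Cohen’s d |
|---|---|---|---|---|---|---|
| Treatment allocation | | | | | .008 | NA |
| Switch-BUP | 293 | 31.2 | 180 | 37.0 | | |
| Aug-BUP | 309 | 32.9 | 169 | 34.8 | | |
| Aug-ARI | 338 | 36.0 | 137 | 28.2 | | |
| Education | | | | | .578 | NA |
| Some college | 366 | 38.9 | 187 | 38.5 | | |
| High school or less | 256 | 27.2 | 145 | 29.8 | | |
| Associate’s degree | 131 | 13.9 | 57 | 11.7 | | |
| Bachelor’s or higher | 187 | 19.9 | 97 | 20.0 | | |
| Marital status | | | | | .468 | NA |
| Married/cohabitating | 428 | 45.5 | 219 | 45.1 | | |
| Divorced/separated | 345 | 36.7 | 186 | 38.3 | | |
| Never married | 129 | 13.7 | 69 | 14.2 | | |
| Widowed | 38 | 4.0 | 12 | 2.5 | | |
| Employment status | | | | | .308 | NA |
| Employed | 250 | 26.6 | 113 | 23.3 | | |
| Retired | 295 | 31.4 | 152 | 31.3 | | |
| Unemployed | 392 | 41.7 | 220 | 45.3 | | |
| Substance or alcohol abuse | | | | | .313 | NA |
| Yes | 126 | 13.4 | 56 | 11.5 | | |
| No | 814 | 86.6 | 430 | 88.5 | | |
| CGQᵇ | | | | | .190 | NA |
| ≤3 | 398 | 42.3 | 188 | 38.7 | | |
| >3 | 542 | 57.7 | 298 | 61.3 | | |
| QIDS-Cᶜ (M±SD) | 16.7±3.22 | | 16.6±3.33 | | .387 | NA |
| Age (M±SD years) | 54.1±12.42 | | 55.0±11.58 | | .409 | NA |
| Lifetime episodes of depression (M±SD) | 2.64±1.35 | | 2.45±1.37 | | .012 | .14 |
| Lifetime antidepressant trials (M±SD) | 2.33±1.72 | | 2.42±1.66 | | .084 | NA |
| ACESᵈ (M±SD) | 3.15±2.51 | | 3.17±2.60 | | .98 | |
| C-SSRSᵉ (M±SD) | .75±1.21 | | .90±1.30 | | .016 | .12 |
| BAIᶠ (M±SD) | .86±.52 | | .99±.54 | | <.0001 | .25 |
| DSM-5 mixed featuresᵍ (M±SD) | 11.6±2.59 | | 11.6±2.56 | | .78 | NA |
| CIRSʰ (M±SD) | 1.83±.38 | | 1.80±.35 | | .11 | NA |
| Q-LES-Q-SFⁱ (M±SD) | 42.1±14.3 | | 38.6±14.1 | | <.0001 | .25 |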

aEarly improvers, participants with ≥20% drop from baseline Quick Inventory of Depressive Symptomatology–Clinician Rated (QIDS-C) score by the end of week 2. Frequencies and percents are used for categorical variables and means and standard deviations for continuous variables.

bCGQ, Complicated Grief Questionnaire. Possible scores range from 0 to 10, with higher scores indicating greater complicated grief.

cQIDS-C, Quick Inventory of Depressive Symptomatology–Clinician Rated. Possible scores range from 0 to 27, with higher scores indicating greater severity of depression.

dACES, Adverse Childhood Experiences Survey. Possible scores range from 0 to 10, with higher scores indicating greater childhood adversity and greater risk of psychological or health problems.

eC-SSRS, Columbia Suicide Severity Rating Scale-Suicidal Ideation. Possible scores range from 0 to 5, with higher scores indicating greater suicidal ideation or intent.

fBAI, Beck Anxiety Inventory. Possible scores range from 0 to 3 (average rating of each of the 21 items), with higher scores indicating greater anxiety.

gDSM-5 mixed features, presence of mixed features by a self-rated 9-item mixed features scale based on the DSM-5. Possible scores range from 9 to 27, with higher scores indicating more hypomanic or manic symptoms.

hCIRS, Cumulative Illness Rating Scale Comorbidity Index. Possible scores range from 0 to 4, with higher scores indicating greater severity of co-occurring medical conditions.

iQ-LES-Q-SF, Quality of Life Enjoyment and Satisfaction Questionnaire–Short Form. Possible scores range from 0% to 100% of the maximum scale score of 70, with higher scores indicating greater life satisfaction and enjoyment.

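TABLE 3. Baseline measures influencing achievement of remission by week 12 despite no early improvement (false negative outcome)

| Baseline measure | False negative (N=36): N | False negative: % | True negative (N=450): N | True negative: % | p | Cohen’s d |
|---|---|---|---|---|---|---|
| Treatment allocation | | | | | 0.7 | NA |
| Switch-BUP | 11 | 30.6 | 169 | 37.6 | | |
| Aug-BUP | 14 | 38.9 | 155 | 34.4 | | |
| Aug-ARI | 11 | 30.6 | 126 | 28.0 | | |
| Education | | | | | 0.53 | NA |
| Some college | 14 | 38.9 | 173 | 38.4 | | |
| High school or less | 13 | 36.1 | 132 | 29.3 | | |
| Associate’s degree | 5 | 13.9 | 52 | 11.6 | | |
| Bachelor’s degree or higher | 4 | 11.1 | 93 | 20.7 | | |
| Marital status | | | | | 0.45 | NA |
| Married/cohabitating | 13 | 36.1 | 206 | 45.8 | | |
| Divorced/separated | 17 | 47.2 | 169 | 37.6 | | |
| Never married | 6 | 16.7 | 63 | 14.0 | | |
| Widowed | 0 | 0.0 | 12 | 2.7 | | |
| Employment status | | | | | 0.5 | NA |
| Employed | 10 | 27.8 | 103 | 22.9 | | |
| Retired | 13 | 36.1 | 139 | 30.9 | | |
| Unemployed | 13 | 36.1 | 207 | 46.1 | | |
| Substance or alcohol abuse | | | | | 0.78 | NA |
| Yes | 3 | 8.3 | 53 | 11.8 | | |
| No | 33 | 91.7 | 397 | 88.2 | | |
| CGQᵃ | | | | | 0.7 | NA |
| ≤3 | 15 | 41.7 | 173 | 38.4 | | |
| >3 | 21 | 58.3 | 277 | 61.6 | | |
| QIDS-Cᵇ (M±SD) | 14.0±3.1 | | 16.7±3.3 | | <0.001 | 0.84 |
| Age (M±SD years) | 53.7±13.3 | | 55.0±11.4 | | 0.56 | NA |
| Lifetime episodes of depression (M±SD) | 2.6±1.2 | | 2.4±1.4 | | 0.27 | NA |
| Lifetime antidepressant trials (M±SD) | 2.6±1.9 | | 2.4±1.6 | | 0.72 | |
| ACESᶜ (M±SD) | 2.3±2.2 | | 3.2±2.6 | | 0.04 | 0.37 |
| C-SSRSᵈ (M±SD) | 0.52±1.1 | | 0.93±1.3 | | 0.05 | 0.34 |
| BAIᵉ (M±SD) | 0.71±0.5 | | 1.01±0.5 | | 0.001 | 0.61 |
| DSM-5 mixed featuresᶠ (M±SD) | 11.0±2.1 | | 11.7±2.6 | | 0.20 | NA |
| CIRSᵍ (M±SD) | 1.7±0.39 | | 1.80±0.34 | | 0.42 | NA |
| Q-LES-Q-SFʰ (M±SD) | 46.2±13.6 | | 38.0±13.9 | | <0.001 | 0.59 |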

aCGQ, Complicated Grief Questionnaire. Possible scores range from 0 to 10, with higher scores indicating greater complicated grief.

bQIDS-C, Quick Inventory of Depressive Symptomatology–Clinician Rated. Possible scores range from 0 to 27, with higher scores indicating greater severity of depression.

cACES, Adverse Childhood Experiences Survey. Possible scores range from 0 to 10, with higher scores indicating greater childhood adversity and greater risk of psychological or health problems.

dC-SSRS, Columbia Suicide Severity Rating Scale-Suicidal Ideation. Possible scores range from 0 to 5, with higher scores indicating greater suicidal ideation or intent.

eBAI, Beck Anxiety Inventory. Possible scores range from 0 to 3 (average rating of each of the 21 items), with higher scores indicating greater anxiety.

fDSM-5 mixed features, presence of mixed features by a self-rated 9-item mixed features scale based on the DSM-5. Possible scores range from 9 to 27, with higher scores indicating more hypomanic or manic symptoms.

gCIRS, Cumulative Illness Rating Scale Comorbidity Index. Possible scores range from 0 to 4, with higher scores indicating greater severity of co-occurring medical conditions.

hQ-LES-Q-SF, Quality of Life Enjoyment and Satisfaction Questionnaire-Short Form. Possible scores range from 0% to 100% of the maximum scale score of 70, with higher scores indicating greater life satisfaction and enjoyment.

Of the 940 participants who met the criterion for early improvement, 143 were withdrawn from the study (15%) for various reasons that have been described previously (12). Of the 582 participants who did not meet the criterion for early improvement, 171 (29%) were withdrawn from the study during the acute phase of treatment. These rates were significantly different according to a chi-square analysis (χ2=23.53, df=1, p<0.001).

The PPV for remission was mostly influenced by the magnitude of the percentage drop in QIDS-C score from baseline, with no obvious benefit provided by the duration of the observation window (Figure 1). The NPV, in contrast to the PPV, was influenced to some extent by the duration of the observation window. The NPV for greater than minimal improvement was low to moderate at all observation periods evaluated. While the use of at least a 20% drop from the baseline QIDS-C score by the end of week 2 may not provide the strongest NPV for remission, NPV improved by only 4 percentage points (from 93% to 97%) when we extended the observation period to 6 weeks. Receiver operating characteristic curves for the ability of early improvement at week 2 to predict remission as a function of treatment allocation are presented in Figure 2. Area-under-the-curve comparisons did not support an influence of treatment on the predictive ability of early improvement.

FIGURE 1. Positive predictive values and negative predictive values based on percentage drop from baseline QIDS-C score over various observational periods for remission, response, and greater than minimal improvementᵃ

aQIDS-C, Quick Inventory of Depressive Symptomatology–Clinician Rated. Because of withdrawals, the number of participants included in the analysis for remission was 1,458; 1,426; 1,367; and 1,283 for weeks 1, 2, 4, and 6, respectively, and the number of participants included in the analysis for response and greater than minimal improvement was 1,108; 1,112; 1,114; and 1,108 for weeks 1, 2, 4, and 6, respectively.

FIGURE 2. Influence of treatment group on predictive value of early improvement at week 2ᵃ

aAug-ARI, augmentation with aripiprazole; Aug-BUP, augmentation with bupropion-SR; Switch-BUP, switching to another antidepressant (i.e., bupropion-SR). Increasing sensitivity and specificity are plotted for 10%, 20%, 30%, 40%, and 50% reductions from baseline Quick Inventory of Depressive Symptomatology–Clinician Rated score at week 2, respectively, for the three medication regimens. Area-under-the-curve comparisons by chi-square analysis did not support an influence of treatment assignment on prediction of remission by early improvement (Switch-BUP vs. Aug-BUP, χ2=0.056, p=0.81; Switch-BUP vs. Aug-ARI, χ2=0.0003, p=0.99; Aug-BUP vs. Aug-ARI, χ2=0.054, p=0.82).
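One common way to summarize a curve built from a handful of sensitivity and specificity pairs is a trapezoidal area under the curve. The sketch below assumes that approach and is not taken from the study; the function name is hypothetical, and it does not reproduce the chi-square comparison of areas between treatment groups.

```python
def roc_auc(points):
    """Trapezoidal area under an ROC curve.

    `points` is an iterable of (sensitivity, specificity) pairs, one per
    improvement threshold. The curve is anchored at (0, 0) and (1, 1).
    """
    # Convert to (false positive rate, true positive rate) and add the anchors.
    curve = sorted(
        {(1.0 - spec, sens) for sens, spec in points} | {(0.0, 0.0), (1.0, 1.0)}
    )
    auc = 0.0
    for (x0, y0), (x1, y1) in zip(curve, curve[1:]):
        auc += (x1 - x0) * (y0 + y1) / 2.0
    return auc
```

A predictor no better than chance yields an area of 0.5, and a perfect predictor yields 1.0; one curve per treatment group would give three areas to compare.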

The average prescribed dosages of bupropion at the end of week 2 were 237 mg and 221 mg for the Switch-BUP and Aug-BUP groups, respectively. The average dosage of aripiprazole at the end of week 2 was 3 mg (a full description of average dosages by time observation point is provided in a table in the online supplement).

Discussion and Conclusions

For any antidepressant medication trial, it is important to identify as early as possible whether the patient is likely to achieve remission with the current treatment regimen. In this analysis of the VAST-D study, which consisted of participants who were inadequately responsive to an initial antidepressant trial, we demonstrated that 62% exhibited a ≥20% drop from the baseline QIDS-C score by the end of week 2 and that this early improvement (or lack of improvement) had a PPV of 38% and an NPV of 93% for prediction of remission by the end of week 12. In addition, our data show that those who reached the 20% threshold of early improvement by week 2 were more likely by the end of week 12 to achieve greater than minimal improvement or response, compared with patients who did not show this level of early improvement. In a smaller study of participants who had not responded to an initial antidepressant trial, venlafaxine was the only antidepressant studied (10). The data from that study suggested a greater benefit from assessing improvement at the end of week 4 instead of week 2, although the magnitude of the NPV and the pattern of the NPV acting as a better predictor than the PPV were similar. In that study, predictive values were evaluated only at weeks 2 and 4 for >20% or >30% drops from the baseline depression score. In the present study, we systematically studied multiple time observation windows and percentage drops from the baseline depression score. We also allowed dosage adjustment as early as the end of week 1. This difference may have contributed to the higher NPV values. Early improvement was also found to be useful as a predictor of subsequent remission in a trial of electroconvulsive therapy (ECT), although early improvement with ECT appeared to provide a higher PPV than NPV (24–26).
Thus, the preponderance of evidence supports the importance of early improvement (or lack thereof) in predicting later remission and response in patients with major depressive disorder. Although we identified five factors (allocation to Aug-ARI, more lifetime episodes of depression, less severe suicidal ideation, less anxiety, and a higher baseline quality of life score) that influenced achieving early improvement, the effect sizes of the influence of these factors were of a small magnitude (Cohen’s d=0.12–0.25).

The present study bolsters the proposed use of the lack of early improvement as a predictor of failure to achieve remission with the current medication. In fact, in the VAST-D study, the NPV for early improvement was over 92%. The lack of early improvement contributes to identifying a majority of those who will not ultimately demonstrate remission of symptoms with the current treatment, even if the dosage is increased to the optimal therapeutic dosage. Therefore, if there is not at least a 20% drop from the baseline QIDS-C score by the end of week 2, there is a <8% chance of achieving remission, just over a one-in-three (38%) chance of reaching the response criterion, and a five-eighths (62%) chance of achieving greater than minimal improvement at the end of week 12 with continuation of the medication. In contrast to the prediction of remission, when predicting response and greater than minimal improvement, PPV is generally a better predictor than NPV (Figure 1). The predictive ability of early improvement did not differ across treatment groups.
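The three probabilities quoted above are the complements of the corresponding NPVs. As a worked check using the FN and TN counts reported in Table 1 (the conditional-probability notation is introduced here only for illustration):

```latex
\begin{align*}
P(\text{outcome} \mid \text{no early improvement})
  &= \frac{FN}{FN + TN} = 1 - \mathrm{NPV} \\[4pt]
\text{Remission:} \quad & \frac{36}{36 + 450} = \frac{36}{486} \approx 7.4\% \\
\text{Response:} \quad & \frac{131}{131 + 212} = \frac{131}{343} \approx 38.2\% \\
\text{Greater than minimal improvement:} \quad & \frac{213}{213 + 130} = \frac{213}{343} \approx 62.1\%
\end{align*}
```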

Those who did not achieve early improvement were nearly twice as likely to be withdrawn from the study as those who achieved early improvement (29% vs. 15%, respectively). Study withdrawal may account, at least in part, for the low remission and response rates among patients who did not experience early improvement. It would be important to learn whether more perseverance would have resulted in better outcomes for some of these patients. The present results suggest that a change in intervention is likely warranted relatively early in a medication trial if early improvement is not evident. However, specific patient groups may benefit from a longer duration of the intervention.

Identifying the characteristics of patients who would benefit from additional time is important, as is developing strategies to enhance treatment adherence when improvement is slower than anticipated. Evaluation of the factors influencing a false negative outcome sheds some light on this issue. Participants who did not show early improvement but achieved remission by the end of week 12 (false negative outcome) were more likely to have lower baseline QIDS-C scores, fewer adverse childhood experiences, lower baseline Beck Anxiety Inventory score, lower C-SSRS scores, and higher baseline quality of life (Q-LES-Q-SF) scores. These findings are similar to the factors influencing inclusion in the early improvement group, but the effect sizes were much larger among the participants classified as having false negative outcomes (0.37–0.84 vs. 0.12–0.25, respectively).

The use of early clinical improvement to predict remission has been reviewed by Lam (27). Four basic points were supported in the review: most improvement occurs during the first 2 weeks of treatment (28), early improvement differentiates SSRIs from placebo (29), early improvement is likely to be sustained (30), and early improvement predicts later remission (31) and better psychosocial functioning (32). Our data are consistent with findings that most of the improvement occurs early and is sustained and that there is utility in the use of early improvement or lack thereof to predict remission. We cannot comment on comparisons with placebo, because we did not use such a control in the VAST-D study. Although psychosocial functioning as an outcome measure is not addressed here, subsequent VAST-D reports will evaluate the role of psychosocial functioning and quality of life in these patients.

Does the use of early improvement as a predictor of remission make a difference in clinical decisions? Only one study has tested a strategy of changing the clinical management when early improvement (in the first 2 weeks) was not achieved during an initial trial of the antidepressant escitalopram (33). Only 192 of 879 participants (22%) in the Tadić et al. study met the predetermined criteria to enter the comparison group of early (week 2) medication change (to venlafaxine) or continuation of treatment as usual (escitalopram). The chosen endpoint of that study was remission as measured by the Hamilton Depression Rating Scale at week 8. While the data showed only a nonsignificant trend in the direction of early medication change providing a better outcome, a major confounding issue in the Tadić et al. study was that more patients in the treatment-as-usual group ultimately received the alternative intervention, venlafaxine, than those who had been allocated to switch to venlafaxine. In contrast to that trial, the VAST-D trial did not allow switching of treatments after initial assignment.

Strengths and Limitations

One of the strengths of the present analysis of the VAST-D data was the availability of a large patient population who received frequent, closely monitored visits with dosing guided by measurement-based care. A second strength is that the study population focused on patients who had inadequate response to prior treatment for depression. These factors suggest that this study was ideally suited to determine the predictive value of early improvement. Because of the large patient population and multiple assessment visits early in the trial, we were able to bolster evidence provided by existing studies on the utility of using early improvement (or lack thereof) as a guide for clinical management. When our findings in a large sample of patients inadequately responding to an initial antidepressant trial are compared with prior studies addressing the role of early improvement in initial antidepressant trials, it is apparent that the utility of determining the presence of early improvement is robust across clinical populations.

This study has some limitations. It is possible that some component of early improvement may be associated with the expectation of benefit associated with entering a randomized trial. Despite this concern, treatment duration in the trial had no impact on PPV. In contrast, there were modest changes in the NPV over time, which were greatest at the end of week 6 (Figure 1), consistent with an earlier report (10). Although the VAST-D study was conducted in a diverse sample with regard to most baseline characteristics (6, 13), the patient population was predominantly male (approximately 85%), which may limit generalizability to populations with a greater proportion of women. Also, on average, participants were below the target dosage for their augmenting agents or bupropion when early improvement was assessed. While full characterization of factors influencing remission may require taking into account the optimal dosages of antidepressant medications, it is encouraging that in the present study we could use the absence of early improvement to predict likely failure to achieve remission before the full antidepressant dosage was achieved. However, the ultimate value of early improvement depends on whether changing interventions at the end of week 2 produces better outcomes.

Results must also be interpreted in the context of VAST-D being a “next-step” treatment study of patients who had already experienced inadequate response to at least one antidepressant trial. Thus, overall remission rates were relatively low, ranging from 22% for patients in the Switch-BUP group to 29% for those in the Aug-ARI group. These low remission rates resulted in a lower ceiling for the PPV. Higher overall remission rates were achieved in the initial treatment phase of the Sequenced Treatment Alternatives to Relieve Depression (STAR*D) study with “first-step” trials (6) and these likely would have been associated with higher PPVs. These caveats aside, a majority of patients seen in clinical settings ultimately require next-step strategies, and the results of this study are directly applicable to this large and important patient group.
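The dependence of PPV on the overall remission rate can be illustrated with Bayes' rule. In the minimal sketch below, the function name and the sensitivity and specificity values are assumptions made for illustration only and are not estimates from VAST-D or STAR*D. Under the further assumption that sensitivity and specificity stay fixed, it shows that PPV rises as the base rate of remission rises.

```python
def ppv_from_base_rate(base_rate, sensitivity, specificity):
    """PPV via Bayes' rule for a given overall remission rate.

    base_rate:   overall probability of remission
    sensitivity: P(early improvement | remission)
    specificity: P(no early improvement | no remission)
    """
    true_pos = sensitivity * base_rate
    false_pos = (1 - specificity) * (1 - base_rate)
    return true_pos / (true_pos + false_pos)


# Hypothetical test characteristics (assumption, not study estimates).
SENS, SPEC = 0.85, 0.50

for rate in (0.22, 0.29, 0.40):
    ppv = ppv_from_base_rate(rate, SENS, SPEC)
    print(f"remission rate {rate:.0%} -> PPV {ppv:.0%}")
```

The first two rates match the range of remission rates reported above for the treatment groups; the third is an arbitrary higher value standing in for a first-step population.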

Importance of Findings

Through this analysis, we reinforced existing literature that supports the utility of using early improvement in patients taking antidepressant medication to predict later remission, response, and greater than minimal improvement. Also, we were able to identify an optimal time for assessing early improvement. The predictive importance of lack of early improvement is based on the assumption that standard assessments of depression severity are obtained at least at baseline and at the end of week 2 of each new medication intervention.

Future Research

Lack of early improvement following initiation of an antidepressant medication regimen tells us only that the current therapy, even allowing for dosage escalation, is unlikely to be effective. It does not tell us what the next step should be. The utility of using the absence of early improvement to enhance clinical outcomes should be evaluated in randomized controlled trials that test whether continuing the current treatment for a longer duration or switching to an alternative intervention is more effective for those failing to show early improvement.

Department of Psychiatry, Baylor Scott & White Health, and Texas A&M College of Medicine, Temple, Texas (Hicks); Yale University School of Public Health, New Haven, Connecticut (Sevilimedu); Cooperative Studies Program Coordinating Center, Veterans Affairs (VA) Connecticut Healthcare System, West Haven (Sevilimedu, Johnson); VA San Diego Healthcare System, San Diego (Tal, Zisook); Louis Stokes Cleveland VA Medical Center, Cleveland (Chen); Tuscaloosa VA Medical Center, Tuscaloosa, Alabama, and University of Alabama School of Medicine, Birmingham (Davis); Cooperative Studies Program Clinical Research Pharmacy Coordinating Center, Albuquerque, New Mexico (Vertrees); VA New England Mental Illness Research, Education, and Clinical Center, VA Connecticut Healthcare System, West Haven (Mohamed); Department of Psychiatry, University of California, San Diego (Zisook).
Send correspondence to Dr. Hicks.

Drs. Hicks and Sevilimedu contributed equally to this article.

Components of the data presented in this article were discussed at the annual meeting of the American Psychiatric Association, May 5–9, 2018, New York City.

This study was supported and conducted by the Cooperative Studies Program (CSP 576), Department of Veterans Affairs, Office of Research and Development. The CSP was involved in the design and conduct of the study; the collection, management, analysis, and interpretation of the data; and the preparation, review, and approval of the manuscript. The CSP had no role in the decision to submit the manuscript for publication. Bristol-Myers Squibb provided aripiprazole (Abilify) for use in this study. Clinicaltrials.gov identifier: NCT01421342.

The opinions expressed in this article are those of the authors and do not necessarily represent the views of the U.S. Department of Veterans Affairs or the U.S. government.

Mr. Johnson owns stock in Bristol-Myers Squibb, where his spouse is an employee. Dr. Davis has received research funding from Tonix and Merck as well as personal consulting fees from Bracket, Janssen, Otsuka, Lundbeck, and Tonix. Dr. Zisook receives funding from Defender Pharmaceuticals and COMPASS Pathways. The other authors report no financial relationships with commercial interests.

The authors thank the local site investigators, independent evaluators, nurse coordinators, and patient participants at the 35 Veterans Affairs Augmentation and Switching Treatments for Improving Depression Outcomes enrollment sites; the CSP Coordinating Center at the VA Connecticut Healthcare System, West Haven, for providing statistical analyses; and A. John Rush for reviewing a draft of the article.

References

1 Kessler RC, Berglund P, Demler O, et al.: Lifetime prevalence and age-of-onset distributions of DSM-IV disorders in the National Comorbidity Survey Replication. Arch Gen Psychiatry 2005; 62:593–602

2 Murray CJ, Atkinson C, Bhalla K, et al.: The state of US health, 1990–2010: burden of diseases, injuries, and risk factors. JAMA 2013; 310:591–608

3 Practice Guideline for the Treatment of Patients With Major Depressive Disorder, 3rd ed. Arlington, VA, American Psychiatric Association, 2010. http://www.psychiatryonline.com/pracGuide/pracGuideTopic_7.aspx. Accessed Jan 27, 2019

4 Mohamed S, Johnson GR, Vertrees JE, et al.: The VA augmentation and switching treatments for improving depression outcomes (VAST-D) study: rationale and design considerations. Psychiatry Res 2015; 229:760–770

5 Trivedi MH: Treating depression to full remission. J Clin Psychiatry 2009; 70:e01

6 Trivedi MH, Rush AJ, Wisniewski SR, et al.: Evaluation of outcomes with citalopram for depression using measurement-based care in STAR*D: implications for clinical practice. Am J Psychiatry 2006; 163:28–40

7 Wagner S, Engel A, Engelmann J, et al.: Early improvement as a resilience signal predicting later remission to antidepressant treatment in patients with major depressive disorder: systematic review and meta-analysis. J Psychiatr Res 2017; 94:96–106

8 Kemp DE, Ganocy SJ, Brecher M, et al.: Clinical value of early partial symptomatic improvement in the prediction of response and remission during short-term treatment trials in 3,369 subjects with bipolar I or II depression. J Affect Disord 2011; 130:171–179

9 Szegedi A, Jansen WT, van Willigenburg AP, et al.: Early improvement in the first 2 weeks as a predictor of treatment outcome in patients with major depressive disorder: a meta-analysis including 6,562 patients. J Clin Psychiatry 2009; 70:344–353

10 Olgiati P, Serretti A, Souery D, et al.: Early improvement and response to antidepressant medications in adults with major depressive disorder. Meta-analysis and study of a sample with treatment-resistant depression. J Affect Disord 2018; 227:777–786

11 Gorwood P, Bayle F, Vaiva G, et al.: Is it worth assessing progress as early as week 2 to adapt antidepressive treatment strategy? Results from a study on agomelatine and a global meta-analysis. Eur Psychiatry 2013; 28:362–371

12 Mohamed S, Johnson GR, Chen P, et al.: Effect of antidepressant switching vs augmentation on remission among patients with major depressive disorder unresponsive to antidepressant treatment: the VAST-D randomized clinical trial. JAMA 2017; 318:132–145

13 Zisook S, Tal I, Weingart K, et al.: Characteristics of US veteran patients with major depressive disorder who require “next-step” treatments: a VAST-D report. J Affect Disord 2016; 206:232–240

14 Kroenke K, Spitzer RL, Williams JB: The PHQ-9: validity of a brief depression severity measure. J Gen Intern Med 2001; 16:606–613

15 Trivedi MH: Tools and strategies for ongoing assessment of depression: a measurement-based approach to remission. J Clin Psychiatry 2009; 70(Suppl 6):26–31

16 Sheehan DV, Lecrubier Y, Sheehan KH, et al.: The Mini-International Neuropsychiatric Interview (MINI): the development and validation of a structured diagnostic psychiatric interview for DSM-IV and ICD-10. J Clin Psychiatry 1998; 59(Suppl 20):22–33

17 Kessler RC, Magee WJ: Childhood adversities and adult depression: basic patterns of association in a US national survey. Psychol Med 1993; 23:679–690

18 Shear KM, Jackson CT, Essock SM, et al.: Screening for complicated grief among Project Liberty service recipients 18 months after September 11, 2001. Psychiatr Serv 2006; 57:1291–1297

19 Posner K, Brown GK, Stanley B, et al.: The Columbia-Suicide Severity Rating Scale: initial validity and internal consistency findings from three multisite studies with adolescents and adults. Am J Psychiatry 2011; 168:1266–1277

20 Beck AT, Epstein N, Brown G, et al.: An inventory for measuring clinical anxiety: psychometric properties. J Consult Clin Psychol 1988; 56:893–897

21 Linn BS, Linn MW, Gurel L: Cumulative Illness Rating Scale. J Am Geriatr Soc 1968; 16:622–626

22 Endicott J, Nee J, Harrison W, et al.: Quality of Life Enjoyment and Satisfaction Questionnaire: a new measure. Psychopharmacol Bull 1993; 29:321–326

23 Rush AJ, Trivedi MH, Ibrahim HM, et al.: The 16-Item Quick Inventory of Depressive Symptomatology (QIDS), clinician rating (QIDS-C), and self-report (QIDS-SR): a psychometric evaluation in patients with chronic major depression. Biol Psychiatry 2003; 54:573–583

24 Lin CH, Chen MC, Yang WC, et al.: Early improvement predicts outcome of major depressive patients treated with electroconvulsive therapy. Eur Neuropsychopharmacol 2016; 26:225–233

25 Çiftçi A, Ulaş H, Topuzoğlu A, et al.: Is the ultimate treatment response predictable with early response in major depressive episode? Noro Psikiyatri Arsivi 2016; 53:245–252

26 Martínez-Amorós E, Goldberg X, Gálvez V, et al.: Early improvement as a predictor of final remission in major depressive disorder: new insights in electroconvulsive therapy. J Affect Disord 2018; 235:169–175

27 Lam RW: Onset, time course and trajectories of improvement with antidepressants. Eur Neuropsychopharmacol 2012; 22(Suppl 3):S492–S498

28 Posternak MA, Zimmerman M: Is there a delay in the antidepressant effect? A meta-analysis. J Clin Psychiatry 2005; 66:148–158

29 Taylor MJ, Freemantle N, Geddes JR, et al.: Early onset of selective serotonin reuptake inhibitor antidepressant action: systematic review and meta-analysis. Arch Gen Psychiatry 2006; 63:1217–1223

30 Papakostas GI, Perlis RH, Scalia MJ, et al.: A meta-analysis of early sustained response rates between antidepressants and placebo for the treatment of major depressive disorder. J Clin Psychopharmacol 2006; 26:56–60

31 Szegedi A, Müller MJ, Anghelescu I, et al.: Early improvement under mirtazapine and paroxetine predicts later stable response and remission with high sensitivity in patients with major depression. J Clin Psychiatry 2003; 64:413–420

32 Papakostas GI, Petersen T, Denninger JW, et al.: Psychosocial functioning during the treatment of major depressive disorder with fluoxetine. J Clin Psychopharmacol 2004; 24:507–511

33 Tadić A, Wachtlin D, Berger M, et al.: Randomized controlled study of early medication change for nonimprovers to antidepressant therapy in major depression: the EMC trial. Eur Neuropsychopharmacol 2016; 26:705–716