
Initial Severity and Antidepressant Benefits: A Meta-Analysis of Data Submitted to the Food and Drug Administration 


Michael Maksimowski and Zheala Qayyum


PRINTED FROM OXFORD MEDICINE ONLINE. © Oxford University Press, 2016. All Rights Reserved. Under the terms of the licence agreement, an individual user may print out a PDF of a single chapter of a title in Oxford Medicine Online for personal use (for details see Privacy Policy and Legal Notice).


Drug–placebo differences in antidepressant efficacy increase as a function of baseline severity, but are relatively small even for severely depressed patients.

Kirsch et al.1

Research Question:

What are the drug–placebo differences among antidepressants when both published and unpublished data are analyzed? Does antidepressant efficacy depend on the severity of initial depression scores?



Year Study Began:

Included studies from 1985 to 2007

Year Study Published:

2008

Study Location:

Various clinical sites internationally

Who Was Studied:

Data from all double-blind, placebo-controlled randomized controlled trials submitted to the FDA for licensing of fluoxetine, venlafaxine, nefazodone, and paroxetine. All patients had been diagnosed with unipolar major depressive disorder using DSM-IV criteria.

Who Was Excluded:

Randomized controlled trials that did not report mean improvement scores, and medications for which improvement data were not available for all trials.

How Many Participants:


Study Overview:

See Figure 26.1 for an example of a typical study included in the meta-analysis.

Figure 26.1 Example of a Typical Study Included in the Meta-Analysis


a One study was overlapping.

Study Design:

The investigators requested from the FDA all publicly releasable information about clinical trials for fluoxetine, venlafaxine, nefazodone, and paroxetine. The studies included in the meta-analysis were not all peer-reviewed or published. A meta-analysis was conducted that included the baseline severity based on the Hamilton Rating Scale for Depression (HRSD).

Depression severity was assessed in all studies using the criteria proposed by the American Psychiatric Association (APA)2 and adopted by the National Institute for Clinical Excellence (NICE).3 The HRSD score can range from 0 to 54. Scores between 8 and 13 indicate mild depression, scores between 14 and 18 indicate moderate depression, scores between 19 and 22 indicate severe depression, and scores greater than 22 indicate very severe depression. Baseline depression scores were in the very severe range for all but 1 study. Separate meta-analyses were conducted with and without this study, with no difference in results.
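The severity bands above can be written as a small lookup. This is a minimal sketch: the function name `hrsd_severity` is hypothetical, and the label used for scores below 8 is an assumption, since the text does not name that range.

```python
def hrsd_severity(score: int) -> str:
    """Map an HRSD total score to the severity band described in the text."""
    if not 0 <= score <= 54:
        raise ValueError("HRSD score must be between 0 and 54")
    if score <= 7:
        # Assumption: the text does not label scores below 8.
        return "below mild range"
    if score <= 13:
        return "mild"
    if score <= 18:
        return "moderate"
    if score <= 22:
        return "severe"
    return "very severe"


if __name__ == "__main__":
    for example_score in (10, 16, 20, 29):
        print(example_score, hrsd_severity(example_score))
```

Because baseline scores in nearly all included trials exceeded 22, almost every trial would fall in the last band under this scheme.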


Follow-Up:

Follow-up ranged from four to eight weeks among included studies.


Endpoints:

The absolute difference in improvement in depression scores between the drug and placebo groups, and the extent to which drug-related effects are a function of initial depression severity. A clinically significant effect was defined as a three-point drug–placebo difference in HRSD score, based on criteria proposed by NICE.3
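The three-point criterion can be expressed as a simple comparison of mean improvements. The sketch below uses hypothetical function names and made-up improvement values, not data from the meta-analysis. Treating a difference of exactly three points as meeting the criterion is an assumption.

```python
NICE_THRESHOLD = 3.0  # HRSD points; the criterion cited in the text


def drug_placebo_difference(drug_improvement: float, placebo_improvement: float) -> float:
    """Absolute difference in mean HRSD improvement (drug minus placebo)."""
    return drug_improvement - placebo_improvement


def is_clinically_significant(
    drug_improvement: float,
    placebo_improvement: float,
    threshold: float = NICE_THRESHOLD,
) -> bool:
    """True if the drug-placebo difference reaches the threshold.

    Assumption: a difference exactly equal to the threshold counts as meeting it.
    """
    return drug_placebo_difference(drug_improvement, placebo_improvement) >= threshold


if __name__ == "__main__":
    # Hypothetical mean improvements, for illustration only.
    print(is_clinically_significant(10.0, 8.0))  # difference of 2.0 points
    print(is_clinically_significant(12.0, 8.0))  # difference of 4.0 points
```

A difference can be statistically significant in a large pooled sample and still fall below this threshold, which is the distinction the study turns on.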


Results:

  • Improvement in HRSD scores was statistically significantly greater in the drug groups than in the placebo groups, but the overall difference fell short of the three-point criterion for clinical significance used by NICE. The difference reached that criterion only among the most severely depressed patients.

  • Patients with mild or moderate depression scores at baseline did not improve significantly with drug compared to placebo.

  • The researchers attribute the clinically significant drug–placebo difference among patients with very severe depression to a decreased responsiveness to placebo (compared to less depressed patients) rather than an increased responsiveness to the antidepressant.

  • Improvements in depression scores among patients in placebo groups were approximately 80% of the improvement observed in the drug groups; the investigators contrasted this finding with the effect of placebo on pain, which is approximately 50% of the response in active treatment groups.4,5,6 This suggests that among patients with depression who improve following treatment, a relatively small proportion of the improvement may be attributable to the active therapy (Table 26.1).

Table 26.1 Summary of Primary Outcomes

[Table body not recoverable from the source; surviving column labels: "HRSD reduction after treatment" and "P value."]

Note: HRSD = Hamilton Rating Scale for Depression.
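The statement in the results that placebo improvement was approximately 80% of drug improvement is a ratio of mean improvements. The sketch below shows the arithmetic with hypothetical values chosen to give that ratio; the function names are illustrative and the numbers are not taken from the study.

```python
def placebo_share(drug_improvement: float, placebo_improvement: float) -> float:
    """Fraction of the drug-group improvement that is matched by placebo."""
    if drug_improvement <= 0:
        raise ValueError("drug_improvement must be positive")
    return placebo_improvement / drug_improvement


def drug_specific_share(drug_improvement: float, placebo_improvement: float) -> float:
    """Fraction of the drug-group improvement not matched by placebo."""
    return 1.0 - placebo_share(drug_improvement, placebo_improvement)


if __name__ == "__main__":
    # Hypothetical mean HRSD improvements, for illustration only.
    drug, placebo = 10.0, 8.0
    print(f"placebo share: {placebo_share(drug, placebo):.0%}")
    print(f"drug-specific share: {drug_specific_share(drug, placebo):.0%}")
```

Under these illustrative numbers, roughly one fifth of the improvement seen in the drug group would be attributable to the active therapy.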

Criticisms and Limitations:

This is a meta-analysis, and there were likely differences between the included studies (e.g., concomitant psychotherapy, duration, and dose of treatment). However, the investigators tested these variables as possible moderators and found that they had no significant effect on the final results.

Heterogeneity (i.e., variation in results across studies) was low to moderate for the drug groups and moderate for the placebo groups, and the difference in heterogeneity between the two was statistically significant. Thus, the mean treatment effects may not accurately describe the results.

Furthermore, findings from this study cannot be generalized to other antidepressants that were not included (e.g., sertraline, citalopram, escitalopram, bupropion, duloxetine).

Finally, 12 randomized controlled trials were not included in the final meta-analysis due to unavailability of the data.

Other Relevant Studies and Information:

  • A repeat meta-analysis of antidepressant efficacy data was conducted after this publication by Fountoulakis and Möller.7 That study reported a higher drug–placebo difference and higher mean improvements in HRSD scores than the study by Kirsch et al. Antidepressant efficacy was not found to relate to depression severity.

  • Three of the original authors of the Kirsch et al. study published a response article8 to Fountoulakis and Möller’s paper, arguing that there were methodological flaws in their analysis.

  • A previous analysis that did not take into account depression severity found a small, clinically insignificant improvement in depression in those treated with antidepressants versus placebo.9

  • A study by Hansen and colleagues10 found second-generation antidepressants (selective serotonin reuptake inhibitors [SSRIs], serotonin–norepinephrine reuptake inhibitors, norepinephrine–dopamine reuptake inhibitors) do not appear to differ substantially in efficacy relative to first-generation antidepressants (e.g., tricyclic antidepressants).

Summary and Implications:

This large meta-analysis of both published and unpublished data submitted to the FDA found that fluoxetine, paroxetine, venlafaxine, and nefazodone produce only modest benefits with respect to depression scores, which do not meet the threshold for clinically significant improvement defined by the NICE criteria. Improvements among the subgroup of most severely depressed patients are, however, larger and do meet that threshold. The results also show that patients with depression receiving placebo therapy improve substantially, suggesting that among patients who improve following treatment, a relatively small proportion of the improvement may be attributable to the active therapy. Psychiatrists should consider this finding when prescribing antidepressants while remaining mindful of individual patient presentations and needs.

Clinical Case: Prescribing Antidepressants

Case History

A 25-year-old woman presents to an outpatient clinic complaining of low mood, anhedonia, weight loss, and insomnia. She has had difficulty getting out of bed in the morning and is in danger of losing her job. She scores a 20 on the HRSD and is diagnosed with major depressive disorder. Informed consent is obtained to start a trial of fluoxetine.

Based on the results from the study by Kirsch and colleagues, describe the efficacy of the medication choice.

Suggested Answer

The study by Kirsch and colleagues found that fluoxetine, paroxetine, venlafaxine, and nefazodone did not produce clinically significant improvements in depression relative to placebo according to NICE criteria, except for the most severely depressed patients (i.e., those with an HRSD score greater than 28). With an HRSD score of 20, the patient in the vignette falls in the severe range (19 to 22), well below that threshold. On average, we would expect her to show a clinically significant improvement after starting treatment. However, the difference between the improvement observed on a true antidepressant and that observed on placebo may not be clinically significant.


1. Kirsch, I., Deacon, B. J., Huedo-Medina, T. B., Scoboria, A., Moore, T. J., & Johnson, B. T. (2008). Initial severity and antidepressant benefits: A meta-analysis of data submitted to the Food and Drug Administration. PLoS Medicine, 5(2), e45.

2. Task Force for the Handbook of Psychiatric Measures. (2000). Handbook of psychiatric measures. Washington, DC: American Psychiatric Association.

3. National Institute for Clinical Excellence. (2004). Depression: Management of depression in primary and secondary care. Clinical practice guideline no. 23. London: National Institute for Clinical Excellence.

4. Evans, F. J. (1974). The placebo response in pain reduction. Advances in Neurology, 4, 289–296.

5. Benedetti, F., Arduino, C., & Amanzio, M. (1999). Somatotopic activation of opioid systems by target-directed expectations of analgesia. Journal of Neuroscience, 19, 3639–3648.

6. Evans, F. G. (1985). Expectancy, therapeutic instructions, and the placebo response. In L. White, B. Tursky, & G. E. Schwartz (Eds.), Placebo: Theory, research and mechanisms (pp. 215–228). New York: Guilford.

7. Fountoulakis, K. N., & Möller, H. (2011). Efficacy of antidepressants: A re-analysis and re-interpretation of the Kirsch data. International Journal of Neuropsychopharmacology, 14(3), 405–412.

8. Huedo-Medina, T. B., Johnson, B. T., & Kirsch, I. (2012). Kirsch et al.’s (2008) calculations are correct: Reconsidering Fountoulakis & Möller’s re-analysis of the Kirsch data. International Journal of Neuropsychopharmacology, 15(8), 1193–1198.

9. Kirsch, I., Moore, T. J., Scoboria, A., & Nicholls, S. S. (2002). The emperor’s new drugs: An analysis of antidepressant medication data submitted to the U.S. Food and Drug Administration. Prevention & Treatment, 5(1), art. 23.

10. Hansen, R. A., Gartlehner, G., Lohr, K. N., Gaynes, B. N., & Carey, T. S. (2005). Efficacy and safety of second-generation antidepressants in the treatment of major depressive disorder. Annals of Internal Medicine, 143, 415–426.