Patient Health Questionnaire (PHQ-9)

Purpose

The PHQ-9 assesses the presence and intensity of depressive symptoms.

Instrument Details

Acronym PHQ-9

Area of Assessment

Depression

Assessment Type

Patient Reported Outcomes

Administration Mode

Paper & Pencil

Cost

Free

Diagnosis/Conditions

  • Brain Injury
  • Cardiac Dysfunction
  • Parkinson's Disease + Neurologic Rehabilitation
  • Spinal Cord Injury
  • Stroke Recovery

Key Descriptions

  • A self-report questionnaire that includes items from the mood module of the original PHQ.
  • The PHQ-9 was designed both to identify the presence of depressive symptoms and to characterize the severity of depression.
  • 9 items are rated based on frequency of occurrence in the past two weeks (responses in shaded areas of the scoresheet are associated with depression – generally scores of 2 or 3):
    0 = not at all
    1 = several days
    2 = more than half the days
    3 = nearly every day
  • A single additional question rates how difficult these problems have made it to do work, take care of things at home, or get along with other people, using a 4-level scale ranging from not difficult at all to extremely difficult.
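
The item scoring described above can be sketched in Python. This is a minimal illustration, not an official scoring tool: the function name `phq9_total` and the input format (a list of nine integer responses) are assumptions made for the example. The separate functional-difficulty question is not part of the 0 to 27 total, so it is left out.

```python
def phq9_total(item_scores):
    """Sum nine PHQ-9 item responses (each rated 0-3) into a 0-27 total."""
    if len(item_scores) != 9:
        raise ValueError("PHQ-9 requires exactly 9 item responses")
    for score in item_scores:
        # Valid responses: 0 = not at all ... 3 = nearly every day
        if score not in (0, 1, 2, 3):
            raise ValueError(f"invalid response {score!r}; expected 0, 1, 2, or 3")
    return sum(item_scores)


# Hypothetical responses, for illustration only
responses = [1, 2, 0, 3, 1, 0, 2, 1, 0]
print(phq9_total(responses))  # prints 10
```

Rejecting incomplete or out-of-range input, rather than silently summing it, keeps a partial questionnaire from being mistaken for a low score.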

Number of Items

9

Time to Administer

1-3 minutes

Required Training

Reading an Article/Manual

Age Ranges

Adolescent: 13 - 17 years

Adult: 18 - 64 years

Elderly Adult: 65 + years

Instrument Reviewers

Initially reviewed by Jason Raad, MS and the Rehabilitation Measures Team; Updated with references from the Coronary Heart Disease population by Jon Walmsley, SPT and Mike Weiler, SPT; Updated with references for the TBI population by Erin Donnelly and the TBI EDGE task force of the Neurology Section of the APTA in 2012; Updated with references for Parkinson's Disease, dementia, and stroke by Rachel Mason, SPT and Lauren Nevoral, SPT in 4/2012.

ICF Domain

Body Function

Measurement Domain

Emotion

Professional Association Recommendation

Recommendations for use of the instrument from the Neurology Section of the American Physical Therapy Association’s Multiple Sclerosis Taskforce (MSEDGE), Parkinson’s Taskforce (PD EDGE), Spinal Cord Injury Taskforce (SCI EDGE), Stroke Taskforce (StrokEDGE), Traumatic Brain Injury Taskforce (TBI EDGE), and Vestibular Taskforce (Vestibular EDGE) are listed below. These recommendations were developed by a panel of research and clinical experts using a modified Delphi process.

For detailed information about how recommendations were made, please visit:  http://www.neuropt.org/go/healthcare-professionals/neurology-section-outcome-measures-recommendations

Abbreviations:

HR = Highly Recommend
R = Recommend
LS / UR = Reasonable to use, but limited study in target group / Unable to Recommend
NR = Not Recommended

 

Recommendations based on level of care in which the assessment is taken:

TBI EDGE:
  • Acute Care: LS
  • Inpatient Rehabilitation: R
  • Skilled Nursing Facility: R
  • Outpatient Rehabilitation: R
  • Home Health: R

 

Recommendations for use based on ambulatory status after brain injury:

TBI EDGE:
  • Completely Independent: N/A
  • Mildly Dependent: N/A
  • Moderately Dependent: N/A
  • Severely Dependent: N/A

 

Recommendations for entry-level physical therapy education and use in research:

TBI EDGE:
  • Students should learn to administer this tool? No
  • Students should be exposed to tool? Yes
  • Appropriate for use in intervention research studies? Yes
  • Is additional research warranted for this tool? Not reported

Considerations

Using the PHQ-9 in clinical settings may result in a larger than acceptable number of false positives because positive predictive value tends to be low (Wittkampf et al., 2009).

Older Adults and Geriatric Care

Standard Error of Measurement (SEM)

Older Primary Care Patients: 

(Lowe, Unutzer, et al, 2004, n = 434, mean age = 71 (7.4) years, all participants enrolled in the Improving Mood-Promoting Access to Collaborative Treatment (IMPACT), Older Primary Care Patients)

  • SEM for change due to treatment and no control of prior depression = 2.44 
  • SEM for the same number of DSM-IV depressive symptoms at both assessments = 1.32

Minimally Clinically Important Difference (MCID)

Older Primary Care Patients:

(Lowe, Unutzer, et al, 2004, Older Primary Care Patients)

  • MCID = 5 points
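
As a small illustration of how the MCID might be applied, the sketch below flags whether the change between two PHQ-9 totals reaches 5 points. The function name `meets_mcid`, the use of absolute change, and the choice of "at least 5" (rather than "more than 5") are assumptions made for the example, not rules stated in the cited study.

```python
MCID_POINTS = 5  # MCID reported for older primary care patients (Lowe, Unutzer, et al, 2004)


def meets_mcid(baseline_total, followup_total, mcid=MCID_POINTS):
    """Return True when the absolute change in PHQ-9 total is at least the MCID.

    Assumption: a change equal to the MCID counts as clinically important.
    """
    return abs(followup_total - baseline_total) >= mcid


# Hypothetical totals, for illustration only
print(meets_mcid(16, 10))  # 6-point change
print(meets_mcid(16, 13))  # 3-point change
```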

Test/Retest Reliability

Chronically Ill Elderly Patients:

(Lamers, et al., 2008; n = 106; mean age = 71.4 (6.9); 51% diagnosed with DM, Chronically Ill Elderly Patients) 

  • Excellent test-retest reliability (correlation = 0.91)

 

Older Primary Care Patients:

(Lowe, et al, 2004; at baseline and 6 months, Older Primary Care Patients)

  • Excellent test-retest reliability for change due to treatment and no control of prior depression (ICC = 0.81) 
  • Excellent test-retest reliability for same number of DSM-IV depressive symptoms at both assessments (ICC = 0.96)

Non-Specific Patient Population

Cut-Off Scores

Meta-analysis:

(Gilbody, et al, 2007; 14 validated studies reviewed; n = 5,026 participants, Meta-analysis)

  • Clinicians and researchers may elect to adjust cut-scores in response to clinical population characteristics

 

Coronary Heart Disease:

(Thombs, et al, 2008; n = 1024, Coronary Heart Disease) 

  • Cutoff score of > 6 for Major Depressive Disorder, with 83% sensitivity and 76% specificity

(McManus, et al, 2005; n = 1024, Coronary Heart Disease) 

  • Cutoff score of > 10 for Major Depressive Disorder with 54% sensitivity and 90% specificity
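
Sensitivity and specificity alone do not show how many positive screens are true cases; that also depends on how common depression is in the group being screened. The sketch below applies Bayes' rule to the two coronary heart disease cut-offs above. The prevalence of 15% is a hypothetical value chosen for illustration and is not taken from either study; the function name is likewise an assumption.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """PPV = true positives / (true positives + false positives)."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


ASSUMED_PREVALENCE = 0.15  # hypothetical; not reported by the cited studies

for label, sens, spec in [
    ("cutoff > 6 (Thombs, et al, 2008)", 0.83, 0.76),
    ("cutoff > 10 (McManus, et al, 2005)", 0.54, 0.90),
]:
    ppv = positive_predictive_value(sens, spec, ASSUMED_PREVALENCE)
    print(f"{label}: PPV = {ppv:.2f}")
```

Under this assumed prevalence, fewer than half of positive screens would be true cases at either cut-off, which is consistent with the caution about false positives noted under Considerations.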

 

Patients with Mental Health or Somatic Complaints:

(Wittkampf, et al, 2009; n = 664; mean age = 49.8; diagnosed as depressed based on Structured Clinical Interview for DSM-IV Axis I Disorders = 12.3%; Dutch sample, Patients with Mental Health or Somatic Complaints) 

  • For screening, a cutoff of > 10 points demonstrated the highest sensitivity and specificity; however, the PHQ-9 was not found to be specific enough to be used for diagnostic purposes among populations at the highest risk.

 

General Medical Population:

(Lowe et al, 2004; n = 501; mean age = 41.7 (13.8) years; participants were recruited from outpatient clinics and 12 family practices; German sample, General Medical Population) 

  • The PHQ-9 was found to be more accurate in diagnosing ‘major depressive disorder’ than either the Hospital Anxiety and Depression Scale (HADS) or the Well Being Index (WBI-5) 
  • Suggested cut-off to indicate depression is PHQ-9 > 11 

Test/Retest Reliability

Primary Care Patients:

(Zuithoff, et al., 2010; n = 1338; mean age = 51, Primary Care Patients) 

  • Excellent Test-retest reliability (Correlation = 0.94) 

 

Primary Care Patients:

(Kroenke, et al, 2001; n = 3000; mean age = 46 (17) years; most common medical conditions = Hypertension (25%) and Arthritis (11%), Primary Care Patients)

  • Excellent test-retest reliability over a 48 hour period (r = 0.84)

Internal Consistency

Coronary Heart Disease:

(Stafford, et al, 2007; n = 193, Coronary Heart Disease)

  • Excellent internal consistency (Cronbach's alpha = 0.90) 

 

Outpatients:

(Lowe, Spitzer, et al, 2004; n = 501; mean age = 41.7 (13.8) years; German sample, Outpatients)

  • Excellent internal consistency (Cronbach’s alpha = 0.88)

 

Primary Care Patients:

(Kroenke, et al, 2001, Primary Care Patients)

  • Excellent internal consistency (Cronbach's alpha > 0.86)

(Zuithoff, et al, 2010, Primary Care Patients) 

  • Excellent internal consistency (ICC = 0.88)

Criterion Validity (Predictive/Concurrent)

Primary Health Care (with Full PHQ):

(Löwe et al, 2004; n = 501; mean age = 41.7 (13.8) years; German sample; common health complaints included: diseases of the musculoskeletal system and connective tissue (21%), endocrine, nutritional and metabolic diseases (16%), cardiovascular/circulatory diseases (10%); Primary Health Care (with Full PHQ))

 

Concurrent Validity Using the International Diagnostic Checklists (IDCL) for ICD-10 to Assess Severity of Depression (scores are mean (SD) by level of depression):

  • No depressive episode: PHQ = 6.6 (4.7); HADS = 5.4 (3.8); WBI-5 = 13.0 (5.7)
  • Mild depressive episode: PHQ = 13.9 (5.2); HADS = 10.1 (3.3); WBI-5 = 6.7 (4.2)
  • Moderate depressive episode: PHQ = 17.8 (4.7); HADS = 12.7 (4.0); WBI-5 = 3.6 (2.6)
  • Severe depressive episode: PHQ = 19.2 (3.9); HADS = 14.3 (4.3); WBI-5 = 3.6 (2.9)

PHQ = (full) Patient Health Questionnaire
HADS = Hospital Anxiety and Depression Scale
WBI-5 = WHO Well-Being Index 5

Construct Validity

Coronary Heart Disease:

(Stafford et al, 2007; n = 193, Coronary Heart Disease)

  • Moderate correlation between PHQ-9 and Hospital Anxiety and Depression Scale with cutoff score > 6 for major depressive disorder (r = 0.72)

 

General Population:

(Martin, et al, 2006; mean age = 48.8 (18.1); 53% female and 47% male, General Population) 

  • Excellent correlations between PHQ-9 and BDI (r = 0.73, p < 0.001) 
  • Excellent correlations between PHQ-9 and GHQ-12 (r = 0.59, p < 0.001) 

 

Patients with Mental Health or Somatic Complaints:

(Wittkampf et al, 2009, Patients with Mental Health or Somatic Complaints)

  • Adequate correlations between PHQ-9 total score & Hamilton Depression Rating Scale (HDRS-17) score (r = 0.52, two tailed P < 0.01)

Content Validity

  • The PHQ-9 is derived from the Primary Care Evaluation of Mental Disorders interview schedule which utilizes criteria set forth in the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition (DSM-IV) to diagnose depression

  • Rasch analysis suggests the PHQ-9 is a unidimensional measure of depression (Williams, et al, 2009)

  • The PHQ-9 has been found to be accurate in medical settings (Thompson et al, 2011)

Responsiveness

Depression:

(Lowe, et al, 2006; n = 1,788 (diagnosed with: Major Depressive Disorder (MDD) n = 757, minor depression n= 543, other depressive disorders  n = 438 patients); Mean age 50.3 (14.7) years, Depression)

  • On the 0 to 27 scale, the mean change for this population was 10.3 (5.6) points over a 12 week period.
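
The mean change and SD above can be turned into a standardized response mean (SRM), the effect size statistic used for the sertraline results in this section. This is a minimal sketch assuming the common definition, SRM = mean change / SD of change; the function name is hypothetical, and the negative sign for falling scores is a convention chosen to match the reported values.

```python
def standardized_response_mean(mean_change, sd_change):
    """SRM = mean of the change scores divided by their standard deviation."""
    if sd_change <= 0:
        raise ValueError("sd_change must be positive")
    return mean_change / sd_change


# Scores fell by a mean of 10.3 points (SD 5.6), so the change is negative.
srm = standardized_response_mean(-10.3, 5.6)
print(round(srm, 2))  # prints -1.84
```

The result is close to the -1.85 reported for the total sample; the small gap is presumably due to rounding in the published mean and SD.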

Responsiveness of the PHQ-9 to Sertraline (antidepressant) Treatment (scores are mean (SD)):

  • Total sample (n = 1788): baseline = 16.04 (4.87); 12 week follow-up = 5.76 (4.20); effect size* = -1.85 (large)
  • Comorbid physical illness, yes (n = 535): baseline = 16.00 (4.54); 12 week follow-up = 6.75 (4.43); effect size* = -1.71 (large)
  • Comorbid physical illness, no (n = 1253): baseline = 16.06 (5.00); 12 week follow-up = 5.33 (4.02); effect size* = -1.93 (large)

*Standardized response mean

Alzheimer's Disease and Progressive Dementia

Cut-Off Scores

Dementia:

(Hancock et al, 2009; n = 113; mean age = 69 (11.7) years, Dementia)

  • Cut-off point of 9, which coincides with the threshold between mild and moderate depression

Normative Data

Dementia:

(Hancock et al, 2009, Dementia)

  • Demented group PHQ-9 scores:
    • Mode = 0
    • Median = 2
    • Mean = 4.1 +/- 5.4
  • Non demented group PHQ-9 scores:
    • Mode = 0 
    • Median = 3.5 
    • Mean = 7.8 +/- 7.9

Criterion Validity (Predictive/Concurrent)

Dementia:

(Hancock et al, 2009, Dementia)

  • Area under the receiver operating characteristic (ROC) curve = 0.63, which indicates poor diagnostic accuracy

Floor/Ceiling Effects

Dementia:

(Hancock et al, 2009, Dementia)

  • Floor effect: 30% of participants reported no depressive symptoms

Parkinson's Disease

Cut-Off Scores

Parkinson’s Disease:

(Williams et al, 2012; n = 229; mean age = 66.0 (10.8) years, Parkinson's Disease)

  • Cut off score of ≥6 for PHQ-9, with 66% sensitivity and 80% specificity 

(Thompson et al, 2011; n = 214; mean age = 72.5 (9.6) years, Parkinson's Disease)

  • Cut off score of ≥ 5 for major depression for PHQ-9 
  • Cut off score of 2-4 for minor depression for PHQ-9

Normative Data

Parkinson’s Disease:

(Williams et al, 2012, Parkinson's Disease)

  • Active depressive disorder is probable with PHQ-9 scores of 8.9 (5.2)
  • No active depressive disorder is probable with a PHQ-9 score of 3.8 (3.8)

Interrater/Intrarater Reliability

Parkinson’s Disease:

(Thompson et al, 2011, Parkinson's Disease)

  • Adequate interrater reliability between PHQ-9 and SCID = 0.4 (95% CI = 0.26 to 0.54)

Internal Consistency

Parkinson’s Disease:

(Williams et al, 2012, Parkinson's Disease)

  • Excellent internal consistency (Cronbach’s alpha = 0.85)

Stroke

Cut-Off Scores

Stroke:

(de Man-Van Ginkel et al, 2012; n = 55; mean age = 65.09 (15.03) years; mean time since stroke onset = 59.82 (77.43) days, Stroke)

  • The optimum cut-off value for the PHQ-9 was 10 with 100% sensitivity and 86% specificity 

(de Man-Van Ginkel et al, 2012; n = 164; mean age = 70.6 (13.99) years; mean time since stroke onset = 6.7 (0.9) weeks, Stroke)

  • Accuracy of the PHQ-9 was best at a cutoff score of greater than or equal to 10 with a sensitivity of 0.80 and specificity of 0.78

Test/Retest Reliability

Stroke:

(de Man-Van Ginkel et al, 2012, Stroke)

  • Excellent test-retest reliability for the agreement between the pairs of nurses on the sum score level (ICC = 0.98)

Interrater/Intrarater Reliability

Stroke:

(de Man-Van Ginkel et al, 2012, Stroke)

  • Excellent interrater reliability (ICC = 0.98)

Internal Consistency

Stroke:

(de Man-Van Ginkel et al, 2012, Stroke)

  • Excellent internal consistency (Cronbach’s alpha = 0.79)

Criterion Validity (Predictive/Concurrent)

Stroke:

(de Man-Van Ginkel et al, 2012, Stroke)

  • Concurrent validity between PHQ-9 and Geriatric Depression Scale (GDS-15) was excellent with r = 0.7 and P < 0.001 
  • Area under the receiver operating characteristic (ROC) curve = 0.87 (95% CI 0.80-0.93)
  • Discriminatory power for the PHQ-9 was adequate

Brain Injury

Cut-Off Scores

Traumatic Brain Injury:

(Fann, et al, 2005; n = 478; mean age = 42 (17.9) years; mean time since TBI = 3.8 (2.8) months, TBI) 

  • PHQ-9 cutoff > 12 was the best screening criteria for Major Depressive Disorder (MDD) 

(Cook, et al, 2011; n = 365; mean age = 43 (17.7) years; 1 year post-injury, TBI) 

  • Minimal: 0-4 
  • Mild: 5-9 
  • Moderate: 10-14 
  • Moderately severe: 15-19 
  • Severe: ≥ 20

Test/Retest Reliability

Traumatic Brain Injury:

(Fann, et al, 2005, TBI)

  • Excellent test-retest reliability within 7 of initial assessment (r = 0.76)

Construct Validity

Traumatic Brain Injury:

(Fann, et al, 2005, TBI)

  • Excellent convergent validity between the PHQ-9 and SCL-20 (Hopkins Symptom Checklist depression subscale; r = 0.90)
  • Excellent convergent validity between the PHQ-9 and HAM-D (Hamilton Rating Scale for Depression; r = 0.78)
  • Excellent discriminant validity between the PHQ-9 and SCL-20 (r = 0.84)
  • Excellent discriminant validity between the PHQ-9 and HAM-D (r = 0.67, p < 0.001)

Spinal Injuries

Normative Data

SCI:

(Krause, et al, 2009; n = 727, 53.3% had cervical injuries; mean age 47.9 years, time since injury 18.2 years, Chronic SCI)

  • Mean PHQ-9 score = 5.57 (5.74)
  • Mean Older Adult Health and Mood Questionnaire (OAHMQ) score =  6.0 (5.0)

(Bombardier, et al, 2004; Norms based on Kroenke, et al. 2001, Chronic SCI)

  • Mean PHQ-9 score = 5.48 (95% CI = 5.07–5.88)
  • Major Depressive Disorder (MDD) is probable with PHQ-9 scores > 18.1 (3.9)

PHQ-9 SCI Norms (diagnostic category: total score):

  • No depressive symptoms: 0
  • Minimal depressive symptoms: 1 to 4
  • Mild depressive symptoms: 5 to 9
  • Moderate depressive symptoms: 10 to 14
  • Moderate/severe depressive symptoms: 15 to 19
  • Severe depressive symptoms: 20 to 27
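
The SCI norm bands above amount to a simple lookup from total score to category. A minimal sketch follows; the function name `phq9_severity` is an assumption made for the example, and the bands are those listed in this section rather than a universal standard, since other populations in this summary use different cut-offs.

```python
# (upper bound of band, label), in ascending order, from the SCI norms above
SEVERITY_BANDS = [
    (0, "No depressive symptoms"),
    (4, "Minimal depressive symptoms"),
    (9, "Mild depressive symptoms"),
    (14, "Moderate depressive symptoms"),
    (19, "Moderate/severe depressive symptoms"),
    (27, "Severe depressive symptoms"),
]


def phq9_severity(total):
    """Map a PHQ-9 total score (0-27) to its SCI norm category."""
    if not 0 <= total <= 27:
        raise ValueError("PHQ-9 total must be between 0 and 27")
    for upper_bound, label in SEVERITY_BANDS:
        if total <= upper_bound:
            return label


# Hypothetical total, for illustration only
print(phq9_severity(12))  # prints Moderate depressive symptoms
```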

Internal Consistency

SCI:

(Bombardier, et al, 2004, Chronic SCI)

  • Excellent internal consistency (Cronbach's alpha = 0.87)
  • First assessment completed by the patient, while the second was conducted over the phone 48 hours later

Construct Validity

SCI:

(Krause et al, 2009, Chronic SCI)

  • Excellent correlations between the PHQ-9 and Older Adult Health and Mood Questionnaire (OAHMQ) scale (r = 0.78)
  • Adequate correlations between PHQ-9 & prevalence of Major Depressive Disorder (r = 0.530)

 

(Bombardier et al, 2004, Chronic SCI)

  • Adequate convergent validity between PHQ-9 scores and:
    • Life satisfaction (r = -0.51, p < 0.001) 
    • Subjective health (r = -0.50, p < 0.001)

Floor/Ceiling Effects

SCI:

(Williams, et al, 2009; n = 202; mean age = 42.6 (13.9) years; > 1 (Range = 1 to 44) years post injury, Chronic SCI) 

  • 22% of participants reported no depressive symptoms

Bibliography

Bombardier, C. H., Richards, J. S., et al. (2004). "Symptoms of major depression in people with spinal cord injury: implications for screening." Arch Phys Med Rehabil 85(11): 1749-1756.

Cook, K. F., Bombardier, C. H., et al. (2011). "Do somatic and cognitive symptoms of traumatic brain injury confound depression screening?" Archives of Physical Medicine and Rehabilitation 92(5): 818-823.

de Man-van Ginkel, J. M., Gooskens, F., et al. (2012). "Screening for poststroke depression using the patient health questionnaire." Nurs Res 61(5): 333-341.

Fann, J. R., Bombardier, C. H., et al. (2005). "Validity of the Patient Health Questionnaire-9 in assessing depression following traumatic brain injury." Journal of Head Trauma Rehabilitation 20(6): 501-511.

Gilbody, S., Richards, D., et al. (2007). "Screening for depression in medical settings with the Patient Health Questionnaire (PHQ): a diagnostic meta-analysis." J Gen Intern Med 22(11): 1596-1602.

Hancock, P. and Larner, A. (2009). "Clinical utility of Patient Health Questionnaire-9 (PHQ-9) in memory clinics." International Journal of Psychiatry in Clinical Practice 13(3): 188-191.

Janneke, M., Hafsteinsdóttir, T., et al. (2012). "An Efficient Way to Detect Poststroke Depression by Subsequent Administration of a 9-Item and a 2-Item Patient Health Questionnaire." Stroke 43(3): 854-856.

Klaiberg, A. and Braehler, E. (2006). "Validity of the brief patient health questionnaire mood scale (PHQ-9) in the general population." General hospital psychiatry 28: 71-77.

Krause, J. S., Saunders, L. L., et al. (2009). "Comparison of the Patient Health Questionnaire and the Older Adult Health and Mood Questionnaire for self-reported depressive symptoms after spinal cord injury." Rehabil Psychol 54(4): 440-448.

Kroenke, K., Spitzer, R., et al. (2001). "The PHQ-9: validity of a brief depression symptom severity measure." Journal of general internal medicine 16(9): 606-613.

Lamers, F., Jonkers, C., et al. (2008). "Summed score of the Patient Health Questionnaire-9 was a reliable and valid method for depression screening in chronically ill elderly patients." Journal of clinical epidemiology 61(7): 679-687.

Löwe, B., Gräfe, K., et al. (2004). "Diagnosing ICD-10 depressive episodes: superior criterion validity of the Patient Health Questionnaire." Psychotherapy and psychosomatics 73(6): 386-390.

Lowe, B., Schenkel, I., et al. (2006). "Responsiveness of the PHQ-9 to Psychopharmacological Depression Treatment." Psychosomatics 47(1): 62-67.

Löwe, B., Spitzer, R., et al. (2004). "Comparative validity of three screening questionnaires for DSM-IV depressive disorders and physicians' diagnoses." Journal of Affective Disorders 78(2): 131-140.

Lowe, B., Unutzer, J., et al. (2004). "Monitoring depression treatment outcomes with the patient health questionnaire-9." Med Care 42(12): 1194-1201.

McManus, D., Pipkin, S. S., et al. (2005). "Screening for depression in patients with coronary heart disease (data from the Heart and Soul Study)." American Journal of Cardiology 96(8): 1076-1081.

Stafford, L., Berk, M., et al. (2007). "Validity of the Hospital Anxiety and Depression Scale and Patient Health Questionnaire-9 to screen for depression in patients with coronary artery disease." Gen Hosp Psychiatry 29(5): 417-424.

Thombs, B. D., Ziegelstein, R. C., et al. (2008). "Optimizing detection of major depression among patients with coronary artery disease using the patient health questionnaire: data from the heart and soul study." J Gen Intern Med 23(12): 2014-2017.

Thompson, A. W., Liu, H., et al. (2011). "Diagnostic accuracy and agreement across three depression assessment measures for Parkinson's disease." Parkinsonism Relat Disord 17(1): 40-45.

Williams, J. R., Hirsch, E. S., et al. (2012). "A comparison of nine scales to detect depression in Parkinson disease: which scale to use?" Neurology 78(13): 998-1006.

Williams, R. T., Heinemann, A. W., et al. (2009). "Improving measurement properties of the Patient Health Questionnaire-9 with rating scale analysis." Rehabil Psychol 54(2): 198-203.

Wittkampf, K., van Ravesteijn, H., et al. (2009). "The accuracy of Patient Health Questionnaire-9 in detecting depression and measuring depression severity in high-risk groups in primary care." Gen Hosp Psychiatry 31(5): 451-459.

Zuithoff, N. P., Vergouwe, Y., et al. (2010). "The Patient Health Questionnaire-9 for detection of major depressive disorder in primary care: consequences of current thresholds in a crosssectional study." BMC Fam Pract 11(1): 98.
