Multiple Errands Test


The Multiple Errands Test measures how impairments in executive performance affect functioning in natural contexts.

Instrument Details

Acronym MET

Area of Assessment

Executive Functioning
Activities of Daily Living
Attention & Working Memory
Occupational Performance
Life Participation

Assessment Type

Performance Measure



Cost Description

VMET requires GestureTek® IREX™ Systems, approximately $15,000

CDE Status

Not a CDE—last searched 8/25/2023.

Key Descriptions

  • Several versions of the MET have been created in addition to the original, including:
    --Baycrest MET Revised Version (BMET-R, 2007)
    --Multiple Errands Test Revised (MET-R, 2013)
    --VR Version (VMET, 2014)
    --Multiple Errands Test - Contextualized Version (MET-CV, 2018)
    --Home Version (MET-Home, 2019)
    --Big Store Version (2019)
    --Baycrest MET Version (BMET) or Chinese-MET (2020)
  • In all versions, at least 12 tasks must be completed. The MET-R requires task set-up before administration; tutorial videos guide the selection of site-specific tasks. The original version, developed by Shallice and Burgess in 1991, was complex enough that its use was limited to high-functioning patients. The MET-R and MET-HV were adapted for use in a controlled environment, such as hospital grounds or a shopping mall. The VMET is administered virtually, so it can be used with clients whose motor impairments preclude the extensive ambulation the other versions require. It uses a video-capture virtual supermarket programmed into a virtual reality system, which must be purchased.
  • In all versions, tasks are broken down into sets:
    --Errands (purchasing items, using a telephone and mailing a letter)
    --Obtaining information (closing time of local library, price of a candy bar, finding a specified flyer and keeping it with them until the BMET-R is completed)
    --Meeting the assessor at a designated time and place (meet at gift shop 10 minutes after starting)
    --Informing the assessor when task is completed
  • The participant is required to complete these tasks while adhering to 6-8 rules (depending on the version) such as:
    --Adhering to spending limits with purchases
    --Not purchasing more than a certain number of items
    --Not visiting the same area twice
    --Avoiding restricted areas and communication with others
  • The participant is given the rules before beginning. The tester observes the participant and records behavior during performance. For the hospital version, this is done at a distance.
  • The score is based on the total number of errors recorded and the time it took the participant to complete the assessment. Errors are categorized in several areas: inefficiencies, rule breaks, misinterpretation of task requirements, task failures (not completing any of 12 tasks).
  • MET Scores Possible Range: Inefficiencies (0-9), Rule Breaks (0-9), Task Failures (0-12), Total errors (0-30)
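The error tally described above can be sketched in a few lines of Python. This is a minimal illustration, not the published scoring procedure: it assumes the total error score is the plain sum of the three category counts, which is consistent with the listed ranges (9 + 9 + 12 = 30) but is not stated outright here. The class and field names are hypothetical.

```python
from dataclasses import dataclass

# Maximum count per error category, taken from the ranges listed above.
MAX_ERRORS = {"inefficiencies": 9, "rule_breaks": 9, "task_failures": 12}


@dataclass
class MetErrorTally:
    """Hypothetical container for one participant's MET error counts."""
    inefficiencies: int
    rule_breaks: int
    task_failures: int

    def total_errors(self) -> int:
        # Assumption: total errors is the sum of the three categories.
        counts = {
            "inefficiencies": self.inefficiencies,
            "rule_breaks": self.rule_breaks,
            "task_failures": self.task_failures,
        }
        # Reject counts outside the possible range for each category.
        for name, value in counts.items():
            if not 0 <= value <= MAX_ERRORS[name]:
                raise ValueError(f"{name} must be between 0 and {MAX_ERRORS[name]}")
        return sum(counts.values())


tally = MetErrorTally(inefficiencies=3, rule_breaks=2, task_failures=4)
print(tally.total_errors())  # 9
```

Validating each count against its range catches transcription errors on the scoring sheet before they reach the total.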

Number of Items

12 tasks with 6-8 rules

Equipment Required

  • Wristwatch (with GPS tracking for some versions)
  • Carrier Bag
  • Money (amount specified by location of test)
  • Pen
  • Clipboard
  • Map
  • Test instructions
  • Test packet or Task List
  • Scoring sheet
  • Neuro VR Software and Player (for VMET)

Time to Administer

45-60 minutes

Required Training

No Training

Required Training Description

No training is required, but two examiners must be present during test administration.

Age Ranges


13 - 17 years

18 - 64 years

Elderly Adult: 65+ years


Instrument Reviewers

Mary Elizabeth Patnaude, MS, OTR/L

Galila Dandridge, OT Student, UIC

Luz Rocha, OT Student, UIC

Maya Soto, OT Student, UIC 

Melissa Wen, OT Student, UIC

Susan Magasi, PhD, Associate Professor, Departments of Occupational Therapy and Disability Studies, UIC

Measurement Domain

Participation & Activities
Activities of Daily Living

Professional Association Recommendation

None found—last searched 8/25/2023.


  • Executive functioning must be evaluated in real life settings.
  • There are multiple versions of the Multiple Errands Test.
  • The test requires that the participant can ambulate through a natural context for 60 minutes. The VMET is done virtually and can be used with individuals who have motor impairments.
  • There is minimal research supporting its use with clients who have suffered an acute brain injury or acute stroke since they may be too medically fragile to participate in the assessment.
  • Assessing its reliability and validity has been difficult since the measure must be tailored to the environment in which it is administered.
  • The GestureTek® IREX™ system that runs the virtual mall costs approximately $15,000, which may be cost prohibitive for some settings.


Interrater/Intrarater Reliability

Acute Stroke (MET-R): (Morrison et al, 2013; mCVA group n = 25, Mean Age = 60 (10.8) during acute inpatient stay; control group n = 21, Mean Age = 59.9 (15.50); Mean time post CVA = 6 months)

  • Excellent interrater reliability (ICC = 1) 

Stroke (MET-Home): (Burns et al., 2019; n = 23; Mean Age (SD) = 56.7 (10.6) years; Mean Time Post CVA: ≥ 90 days post-stroke; Mild to Moderate Severity; English speaking)

  • Excellent interrater reliability during MET–Home piloting (ICC ≥ .80)
  • Excellent interrater reliability for subscores (ICCs ranged from .88 to .96)

Internal Consistency

Chronic Stroke (MET-HV): (Knight et al, 2002)

  • Adequate internal consistency (Cronbach's alpha = 0.77)

Stroke (MET-Home): (Burns et al., 2019) 

  • Acceptable: Cronbach’s alpha ranged from .68 to .74 for each of the items on the MET-Home task list (overall list of 14 items; α = .73).

Criterion Validity (Predictive/Concurrent)

Concurrent validity:

Chronic Stroke (MET-HV) (Knight et al, 2002)

  • Adequate correlation with the Behavioral Assessment of the Dysexecutive Syndrome (BADS) battery (r = -.57)

Acute Stroke (MET-R) (Morrison et al, 2013)

  • Adequate correlation between the MET-R task completion score and the Executive Function Performance Test (EFPT) total score (r = -.55). 

Stroke (MET-Home): (Burns et al., 2019)

  • Adequate correlations between the MET-Home Accurately Completed (r = -0.49, p < 0.05) and Omissions (r = 0.46, p < 0.05) subscores and EFPT Total Score.
  • Adequate correlations between MET-Home Accurately Completed (r = 0.48, p < 0.05) and Omissions (r = 0.59, p < 0.01) subscales with EFPT Organization subscale.
  • Adequate correlations between MET-Home Passes (r = 0.48, p < 0.05) and Rule Breaks (r = 0.57, p < 0.01) subscales with EFPT Sequencing subscale.

Predictive validity:

Acute Stroke (MET-HV) (Maier, Krauss & Katz, 2011; n = 30; Mean Age = 53.6 (15.04), Mean Days in Rehabilitation = 54.6 (20.31))

  • Adequate predictive validity: MET-HV scores at discharge from the acute rehabilitation hospital predicted Participation Index (M2PI) scores at 180 days post discharge (r = 0.4)

Construct Validity

Convergent Validity: 

Stroke (MET-Home): (Burns et al., 2019)

  • Adequate convergent validity between the MET-Home subscores for Accurately Completed (p < .05), Inefficiencies (p < .01), and Rule Breaks (p < .05) and the Symbol Digit Modalities Test (SDMT).
  • Adequate convergent validity between the MET-Home subscore for Passes and the Delis-Kaplan Executive Function System (D-KEFS) Tower Test – Rule Violations subscale (p < .05).

Discriminant Validity: 

Chronic Stroke (MET-SV) (Alderman et al., 2003; Group 1: n = 46, mean age = 29.2 (8.5); Group 2 (neurologically impaired): n = 50, mean age = 34.6 (12.7), mean time since injury = 72.1 (68.4) months)

  • Adequate construct validity of the MET–SV to differentiate participants with chronic stroke/TBI from healthy participants (r = .44)

Acute Stroke (MET-R): (Morrison et al., 2013).

  • Significant differences between control and mCVA groups on MET-R component scores for total tasks completed (p < .001), total number of rule breaks (p < .001), and performance efficiency (p < .002) 



Mixed Populations

Interrater/Intrarater Reliability

Traumatic Brain Injury/Chronic Stroke (MET-HV): (Knight, Alderman & Burgess, 2002; n = 22, Mean Age = 35.6 (11.3) years, Mean Time Post-Injury = 80.9 (62.6) months)

  • Good inter-rater reliability (ICC = .81) for interpretation failures
  • Excellent inter-rater reliability (ICC = 1) for rule breaks

Construct Validity

Convergent validity

People with Substance Dependence (PWSD): (Valls-Serrano et al., 2018; n = 90; mean age = 35.88 (8.91); PWSD group n = 58; all participants fluent in Spanish; MET-Contextualized Version (CV))

  • MET-CV scales correlated with traditional neuropsychological tasks in polysubstance users: 
    • Adequate convergent validity between Task Failures and Letter Number Sequencing Test (r = -.341, p < .001)
    • Adequate convergent validity between Rule Breaks and Zoo Map Test (r = -.309, p < .001)
    • Poor convergent validity between Interpretation Failures and Stockings of Cambridge Task (r = .273, p < .05)
  • MET-CV scales correlated with drug use variables:
    • Adequate convergent validity between Rule Breaks and abstinence (r = −0.306; p < .001) (Definition of rule breaks = when a specific social rule has been broken; for example, speaking to the examiner or arguing with staff)
    • Adequate convergent validity between Rule Breaks and the Severity of Consumption Index (r = 0.453, p < .001)
    • Poor convergent validity between Rule Breaks and Interpretation Failures (r = .269, p < .001) (Definition of interpretation failures = where the requirements of a task have been misunderstood; for example, buying products not ordered)

Content Validity

Chronic Stroke/TBI (VMET): (Rand et al, 2009; n = 9, Mean Age = 64.2 (7.7); Time Post-Stroke = 4-72 months)

  • Excellent content validity of the MET (r = -.93) and VMET (r = -.87) with the Zoo Map subtest from the Behavioral Assessment of the Dysexecutive Syndrome (BADS) assessment of executive functioning

Face Validity

Chronic Stroke/TBI: (Rand et al, 2009)

  • Excellent face validity of MET (r = -.76) and VMET (r = -.82) with a measure of independence in IADL.


Scores on a modified version of the MET improved significantly in the area of Task Failures after goal-oriented training for executive functioning. This represents minimal evidence of responsiveness (Novakovic-Agopian et al., 2011; Poulin et al., 2013). No effect size was reported.

Neurological Disorders

Internal Consistency

Schizophrenia (MET-1991 Version): (Bulzacka et al., 2016; n = 100; mean age = 30.9 (9.4); mean duration of the illness = 7.7 (6.1) years; schizophrenia group n = 75, schizoaffective disorder group n = 25; 81% male; French sample)

  • Adequate: Cronbach's alpha = 0.75 (average inter-item covariance: 13.2)

Criterion Validity (Predictive/Concurrent)

Concurrent validity:

Schizophrenia (MET-1991 Version): (Bulzacka et al., 2016)

  • Executive Functioning Tests: 
    • Adequate correlations with the Wisconsin Card Sorting Test-64 (WCST)
      • WCST total correct: ρ = 0.3, p = 0.003
      • WCST perseverative: ρ = 0.3, p = 0.003
      • WCST categories completed: ρ = 0.3, p = 0.004
    • Adequate correlation with the Paper version Errands Test (PET; ρ = 0.4, p = 0.0001)
  • Functional Outcomes:
    • Adequate correlations with Social Autonomy Scale (SAS), with the MET composite task score and the global error score being the strongest associations (ρ = 0.39; p = 0.0001 for both)

Alzheimer's Disease and Progressive Dementia

Standard Error of Measurement (SEM)

Dementia (mild to moderate) population (Chinese-MET): (Lai et al., 2020; n = 160; mean age = 70.23 (8.2); mild to moderate dementia; All participants fluent in Mandarin)

  • SEM for Dementia group from the Chinese-MET (n = 20; mean age = 69.25 (16.32)): 3.649


Minimal Detectable Change (MDC)

Dementia (mild to moderate) population (Chinese-MET): (Lai et al., 2020)

  • MDC for Dementia group from the Chinese-MET (n = 20; mean age = 69.25 (16.32)): 10.12 (at 95% CI)
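As a worked check, the reported MDC is consistent with the conventional formula that derives the 95% minimal detectable change from the SEM. That this formula was the one used is an assumption here, since the entry does not state how the value was computed:

```latex
\begin{aligned}
\mathrm{MDC}_{95} &= 1.96 \times \sqrt{2} \times \mathrm{SEM} \\
                  &= 1.96 \times 1.4142 \times 3.649 \\
                  &\approx 10.11
\end{aligned}
```

This agrees with the reported 10.12 to within rounding of the SEM.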


Test/Retest Reliability

Dementia (mild to moderate) population (Chinese-MET): (Lai et al., 2020) 

  • Excellent test-retest reliability: (ICC = .92)

Interrater/Intrarater Reliability

Dementia (mild to moderate) population (Chinese-MET): (Lai et al., 2020) 

  • Excellent: (ICC = .95)

Internal Consistency

Dementia (mild to moderate) population (Chinese-MET): (Lai et al., 2020) 

  • Excellent: Cronbach’s alpha = 0.94

Content Validity

Dementia (mild to moderate) population (Chinese-MET): (Lai et al., 2020) 

The content validity of the Chinese-MET was assessed by an expert panel of five health care professionals: two social workers, a psychiatrist, and two occupational therapists. Lai et al. (2020) reported the panel's ratings for clarity of presentation (means of 4.09 to 4.83), understandability of instructions by older adults (means of 4.03 to 4.76), and relevance to measures of executive function (means of 4.36 to 4.59).

Parkinson's Disease

Standard Error of Measurement (SEM)

Parkinson’s Disease (Virtual MET): (Cipresso et al., 2014; n = 45; mean age for PD-Normal Cognition (NC) = 69 (8.1), for PD-Mild Cognitive Impairment (MCI) = 68.1 (9.4), for Control Group (CG) = 61.7 (5.2))

  • SEM for PD-NC Group (n = 15): 2.806
  • SEM for PD-MCI Group (n = 15): 3.256
  • SEM for CG (n = 15): 1.801

Minimal Detectable Change (MDC)

Parkinson’s Disease (Virtual MET): (Cipresso et al., 2014)

  • MDC for PD-NC Group (n = 15): 7.78 (95% CI)
  • MDC for PD-MCI Group (n = 15): 9.03 (95% CI)
  • MDC for CG (n = 15): 4.99 (95% CI)
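The three MDC values above can be reproduced from the SEM values reported for the same groups. The sketch below assumes the conventional formula MDC95 = 1.96 × √2 × SEM; the entry does not state which formula was used, and the function name `mdc95` is hypothetical.

```python
import math


def mdc95(sem: float) -> float:
    """Minimal detectable change at 95% confidence, derived from the SEM."""
    # 1.96 is the z-value for 95% confidence; sqrt(2) accounts for
    # measurement error in both the test and the retest.
    return 1.96 * math.sqrt(2) * sem


# SEM values as reported for the three groups (Cipresso et al., 2014).
reported_sem = {"PD-NC": 2.806, "PD-MCI": 3.256, "CG": 1.801}

for group, sem in reported_sem.items():
    print(f"{group}: MDC95 = {mdc95(sem):.2f}")
# PD-NC: MDC95 = 7.78
# PD-MCI: MDC95 = 9.03
# CG: MDC95 = 4.99
```

A change score smaller than the group's MDC cannot be distinguished from measurement error at this confidence level.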

Interrater/Intrarater Reliability

Parkinson’s Disease (Virtual MET): (Cipresso et al., 2014)

  • Excellent interrater reliability: (ICC = 0.88)


Alderman, N., Burgess, P.W., Knight, C., & Henman, C. (2003). Ecological validity of a simplified version of the multiple errands shopping test. Journal of the International Neuropsychological Society, 9, 31-44.

Antoniak, K., Clores, J., Jensen, D., Nalder, E., Rotenberg, S., & Dawson, D. R. (2019). Developing and validating a Big-Store Multiple Errands Test. Frontiers in Psychology, 10, 2575.

Bulzacka, E., Delourme, G., Hutin, V., Burban, N., Méary, A., Lajnef, M., Leboyer, M., & Schürhoff, F. (2016). Clinical utility of the Multiple Errands Test in schizophrenia: A preliminary assessment. Psychiatry Research, 240, 390-397. doi: 10.1016/j.psychres.2016.04.056

Burns, S. P., Dawson, D. R., Perea, J. D., Vas, A., Pickens, N. D., & Neville, M. (2019). Development, reliability, and validity of the Multiple Errands Test Home Version (MET–Home) in adults with stroke. American Journal of Occupational Therapy, 73, 7303205030.

Cipresso, P., Albani, G., Serino, S., Pedroli, E., Pallavicini, F., Mauro, A., & Riva, G. (2014). Virtual multiple errands test (VMET): A virtual reality-based tool to detect early executive functions deficit in Parkinson’s disease. Frontiers in Behavioral Neuroscience, 8.

Clark, A. J., Anderson, N. D., Nalder, E., Arshad, S., & Dawson, D. R. (2017). Reliability and construct validity of a revised Baycrest Multiple Errands Test. Neuropsychological Rehabilitation, 27(5), 667-684.

Cuberos-Urbano, G., Caracuel, A., Vilar-Lopez, R., Valls-Serrano, C., Bateman, A., & Verdejo-Garcia, A. (2013). Ecological validity of the multiple errands test using predictive models of dysexecutive problems in everyday life. Journal of Clinical and Experimental Neuropsychology, 35(3), 329-336.

Dawson, D.R., Anderson, N.D., Burgess, P., Cooper, E., Krpan, K.M., & Stuss, D.T. (2009). Further development of the Multiple Errands Test: Standardized scoring, reliability, and ecological validity for the Baycrest version. Archives of Physical Medicine and Rehabilitation, 90, S41-51.

Knight, C., Alderman, N., & Burgess, P.W. (2002). Development of a simplified version of the Multiple Errands Test for use in hospital settings. Neuropsychological Rehabilitation, 12(3), 231-255.

Lai, F. H., Dawson, D., Yan, E. W., Ho, E. C., Tsui, J. W., Fan, S. H., & Lee, A. T. (2020). The validity, reliability and clinical utility of a performance-based executive function assessment in people with mild to moderate dementia. Aging & Mental Health, 24(9), 1496-1504.

Maier, A., Krauss, S., & Katz, N. (2011). Ecological validity of the Multiple Errands Test (MET) on discharge from neurorehabilitation hospital. Occupational Therapy Journal of Research: Occupation, Participation and Health, 31(1), S38-46.

Morrison, M.T., Giles, G.M., Ryan, J.D., Baum, C.M., Dromerick, A.W., Polatajko, H.J., & Edwards, D.F. (2013). Multiple errands test-revised (MET-R): A performance based measure of executive function in people with mild cerebrovascular accident. American Journal of Occupational Therapy, 67, 460-468.

Novakovic-Agopian, T., Chen, A.J.W., Rome, S., Abrams, G., Castelli, H., Rossi, A., McKim, R., Hills, N., & D’Esposito, M. (2011). Rehabilitation of executive functioning with training in attention regulation applied to individually defined goals: A pilot study bridging theory, assessment, and treatment. Journal of Head Trauma Rehabilitation, 26(5), 325-338.

Poulin, V., Korner-Bitensky, N., Dawson, D. (2013). Stroke-specific executive function assessment: A literature review of performance-based tools. Australian Occupational Therapy Journal, 60, 3-19.

Rand, D., Rukan, S.B., Weiss, P.L., & Katz, N. (2009). Validation of the virtual MET as an assessment tool for executive functioning. Neuropsychological Rehabilitation, 19(4), 583-602.

Shallice, T. & Burgess, P.W. (1991). Deficits in strategy application following frontal lobe damage in man. Brain, 114, 727-741.

Valls-Serrano, C., Verdejo-García, A., Noël, X., & Caracuel, A. (2018). Development of a Contextualized Version of the Multiple Errands Test for People with Substance Dependence. Journal of the International Neuropsychological Society, 24(4), 347-359.