NINDS CDE Notice of Copyright
Stroop Test
Availability
Please visit this website for more information about the instrument: Stroop Test
A commonly used version is the Delis-Kaplan Executive Function System (D-KEFS) Color-Word Interference Test (CWIT). The CWIT consists of the three traditional Stroop trials (color naming, color name reading, interference) as well as a fourth trial in which the subject switches back and forth between naming the dissonant ink colors and reading the conflicting color names. The stimulus booklet and forms are copyrighted and included as part of the D-KEFS test kit, but can be purchased separately from the test publisher (Stroop Color and Word Test).
Classification
Supplemental – Highly Recommended: Myalgic encephalomyelitis/Chronic fatigue syndrome (ME/CFS)
 
Supplemental: Huntington’s Disease (HD), Multiple Sclerosis (MS), Sport-Related Concussion (SRC), and Stroke
Short Description of Instrument
The Stroop test involves three trials. In the WORD trial, the subject reads words of color names (e.g., red, blue) printed in black ink. In the COLOR trial, the subject identifies colors (e.g., rectangles printed in red or blue). Finally, in the COLOR-WORD response inhibition trial, the subject must name the color in which a word is presented, while ignoring the printed word. Thus, incongruence between the word’s color and identity (e.g., the word “blue” presented in red) requires inhibition and response selection. Multiple versions of the Stroop test are available (e.g., Victoria, Golden, D-KEFS, and Trenerry versions).
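To illustrate the incongruence described above, the following minimal Python sketch builds a list of COLOR-WORD items in which the ink color never matches the printed word. The color set, the function name `make_incongruent_items`, and the item count are illustrative assumptions, not part of any published Stroop version.

```python
import random

# Hypothetical color set for illustration; published versions define their own stimuli.
COLORS = ["red", "blue", "green"]


def make_incongruent_items(n_items, seed=None):
    """Return n_items (word, ink) pairs where the ink color differs from the word."""
    rng = random.Random(seed)
    items = []
    for _ in range(n_items):
        word = rng.choice(COLORS)
        # Pick the ink from the remaining colors so the item is always incongruent.
        ink = rng.choice([c for c in COLORS if c != word])
        items.append((word, ink))
    return items


if __name__ == "__main__":
    for word, ink in make_incongruent_items(5, seed=0):
        print(f'word "{word}" printed in {ink} ink -> correct response: {ink}')
```

In each generated item the correct response is the ink color, so the subject has to suppress the more automatic response of reading the word.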
 
Construct measured: Cognitive flexibility, attention, and processing speed
 
Generic vs. disease specific: Generic
 
Means of administration (paper and pencil, computerized): Paper and Computerized
 
Location of administration (clinic, home, telephone): Clinical and Research Settings
 
Intended respondent (patient, caregiver): Patient
 
# of items: N/A
 
# of subscales and names of sub-scales: N/A
Comments/Special Instructions
Measurements: Type of scale used to describe individual items and total/subscale scores (nominal, ordinal, or [essentially] continuous): Continuous.
If ordinal or continuous, explain if ceiling or floor effects are to be expected if the measure is used in specific HD Subgroups. No floor effects. Ceiling effects can be avoided if any subjects who reach the end of the page before the allotted time has elapsed are redirected to the top row and continue working until the end of the allotted time-period. Individuals with advanced disease may struggle with the interference trial.
 
Huntington’s Disease-Specific:
The UHDRS version of the Stroop task has been most commonly used in HD research. To date, no one version of the Stroop test has been shown to be clearly superior to others.

Intended use of instrument/purpose of tool (cross-sectional, longitudinal, diagnostic): Assessment of cognitive function in HD cross-sectional and longitudinal studies.

Sensitivity to Change/Ability to Detect Change (over time or in response to an intervention): In published cross-sectional (Stout et al., 2011) and internal analyses (PREDICT-HD), the test is sensitive to changes in premanifest HD, especially in individuals who are closer to an expected diagnosis. Unpublished internal analyses of 7-year longitudinal data (PREDICT) also show changes in rates of change over time in premanifest HD on all subtests, especially color and word naming.
 
In a cross-sectional analysis of the Stroop WORD trial, the TRACK-HD study found that healthy controls performed significantly better than both the early HD and the premanifest HD groups. Longitudinally, the TRACK-HD study found significant differences in rates of change for early HD compared to controls, but did not find significant differences in rates of change for premanifest HD compared to controls.
 
In Stroop WORD, the TRACK-HD premanifest participants may be less likely to show cognitive effects than the PREDICT-HD premanifest participants because (1) they are further from estimated onset based on CAG repeat length and age (Langbehn et al., 2004) and (2) they are potentially less progressed, since the TRACK-HD study excluded premanifest subjects with UHDRS motor scores >= 5. In general, cognitive tests will be more effective metrics in studies of premanifest HD when the focus is on subjects who are close to onset.
 
A meta-analysis of HD observational studies published 1993-2007 reveals both cross-sectional performance differences compared to healthy controls and longitudinal change within HD groups over time for Stroop Reading and Stroop Color, evident in both premanifest and early HD. The Stroop Interference findings are weaker, with smaller cross-sectional effect sizes and no significant longitudinal effects (see below).
Scoring
Scoring (include reference to detailed scoring instructions, including calculation of a total score and subscale scores, and any limitations of scale or scoring posed by item no response): Scoring for each trial type is based on the number of correct responses in a fixed amount of time, typically within 45 seconds (Golden, 1975). Higher scores indicate better cognitive performance.
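As a rough illustration of the scoring rule above, the sketch below counts the correct responses that fall within the time limit. The response log format (seconds elapsed, correct flag), the function name `score_trial`, and the example data are assumptions made for illustration; the 45-second default follows the typical limit stated above.

```python
def score_trial(responses, time_limit=45.0):
    """Count correct responses given within time_limit seconds.

    responses: iterable of (elapsed_seconds, is_correct) tuples (hypothetical format).
    """
    return sum(
        1
        for elapsed, is_correct in responses
        if is_correct and elapsed <= time_limit
    )


if __name__ == "__main__":
    # Made-up log: one error at 1.3 s and one correct response after the limit.
    log = [(0.6, True), (1.3, False), (2.1, True), (44.8, True), (45.9, True)]
    print(score_trial(log))  # 3
```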
 
Standardization of scores to a reference population (z scores, T scores): Raw scores can be converted to T scores for different ranges of age and years of education, depending on the norms used. Studies reporting raw scores should control for age and education.
 
If scores have been standardized to a reference population, indicate frame of reference for scoring (general population, HD subjects, other disease groups). General population (5-90 years of age; education levels of 2 to 20 years).
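The conversion mentioned above can be sketched with the standard definitions of z scores and T scores (mean 50, standard deviation 10). The normative mean and standard deviation in the example are made-up placeholders; real values come from the age- and education-stratified norms being used, and published manuals may apply their own correction procedures.

```python
def z_score(raw, norm_mean, norm_sd):
    """Standardize a raw score against a reference-group mean and SD."""
    if norm_sd <= 0:
        raise ValueError("norm_sd must be positive")
    return (raw - norm_mean) / norm_sd


def t_score(raw, norm_mean, norm_sd):
    """Convert a raw score to a T score (mean 50, SD 10)."""
    return 50.0 + 10.0 * z_score(raw, norm_mean, norm_sd)


if __name__ == "__main__":
    # Placeholder norms for illustration only, not taken from any manual.
    print(t_score(raw=38, norm_mean=42.0, norm_sd=8.0))  # 45.0
```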
Psychometric Properties
Reliability: High reliability across different versions.
 
Test-retest or intra-interview (within rater) reliability (as applicable): Test-retest reliabilities cover periods of 1 minute to 10 days. Reliabilities for Word, Color, and Color-Word are respectively .88, .79, and .71 (Jensen, 1965) and .89, .84, and .73 (Golden, 1975).
 
Inter-interview (between-rater) reliability (as applicable):
Internal consistency: Correlations among the subtests are moderate to high (.71 to .84) (Chafetz and Matthews, 2004).
 
Statistical methods used to assess reliability: Intraclass correlations. Reliability data from the CAB study will be available for the Stroop Word condition of this task by end of 2012 for 100 control, 100 premanifest, and 50 early HD subjects.
 
Construct validity: The interference score correlates well with measures of attention and prepotent response inhibition (May and Hasher, 1998).
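For readers unfamiliar with how an interference score can be derived, the sketch below uses one widely cited formulation associated with the Golden version: a predicted Color-Word score of (W x C) / (W + C), with interference taken as the observed Color-Word score minus the prediction. This summary does not specify which formula was used in the cited studies, so treat the function names and example scores as illustrative assumptions and consult the relevant manual before use.

```python
def predicted_color_word(word, color):
    """Predicted Color-Word score from Word (W) and Color (C) scores: (W * C) / (W + C)."""
    if word + color <= 0:
        raise ValueError("word + color must be positive")
    return (word * color) / (word + color)


def interference(word, color, color_word):
    """Observed Color-Word score minus the predicted score (assumed formulation)."""
    return color_word - predicted_color_word(word, color)


if __name__ == "__main__":
    # Made-up raw scores for illustration only.
    print(round(interference(word=100, color=75, color_word=45), 2))  # 2.14
```

Under this formulation, a positive value means the subject named more Color-Word items than predicted from the Word and Color scores.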
 
Known Relationships to Other Variables (e.g., gender, education, age): May not be valid in color-blind individuals. The color-word interference score is vulnerable to aging (Mitrushina et al., 2005). Age and education should be controlled if reporting raw scores.
 
Diagnostic Sensitivity and Specificity, if applicable (in general population, HD population- premanifest/ manifest, other disease groups):
 
Each entry below is given as Group: Effect Size, P value, # of studies/total # of HD participants across studies.

Stroop Reading
Cross-sectional sensitivity in PreHD: All Pre: -0.44, 0.001, 13/242; Near Pre: -0.65, 0.001, 4/152
Cross-sectional sensitivity in HD: Early: -1.29, <0.001, 10/220
Longitudinal sensitivity within subjects: Dx: -0.65, 0.022, 4/115; Near Pre: -0.61, <0.001, 2/160; All Pre: -0.47, <0.003, 4/180

Stroop Colour
Cross-sectional sensitivity in PreHD: All Pre: -0.44, 0.002, 14/260; Near Pre: -0.87, 0.001, 4/152
Cross-sectional sensitivity in HD: Early: -1.35, <0.001, 9/207
Longitudinal sensitivity within subjects: Dx: -0.79, 0.008, 3/102; Near Pre: -0.44, 0.001, 2/160; All Pre: -0.34, 0.001, 4/180

Stroop Interference
Cross-sectional sensitivity in PreHD: All Pre: -0.24, 0.065, 18/332; Near Pre: -0.64, 0.004, 5/158
Cross-sectional sensitivity in HD: Early: -1.09, <0.001, 10/184
Longitudinal sensitivity within subjects: Dx: -0.15, 0.108, 4/115; Near Pre: -0.3, 0.215, 2/159; All Pre: 0, 0.999, 5/212
Rationale/Justification
Strengths: The color and word subtests are particularly sensitive in cross-sectional and longitudinal studies of premanifest and early manifest HD. The task has been tested at sites in the United States, Canada, United Kingdom, Australia, Germany, and Spain. The task is easy to administer. It is a well-established neuropsychological test measure with some literature in mild TBI and sport concussion.
 
Weaknesses: N/A
 
Special Requirements for administration: A stopwatch is required.
 
Administration Time: Assessment takes approximately 2 minutes for each of the three trial types.
 
Translations available: Spanish (Golden Version), Cantonese (Victoria Version). The UHDRS version is available in several European languages including: Czech, Danish, Dutch, Finnish, French, German, Italian, Norwegian, Polish, Portuguese, Spanish and Swedish.
 
References
Golden CJ. Stroop Color and Word Test: A Manual for Clinical and Experimental Uses. Chicago, IL: Stoelting, 1978, pp. 1–32.
 
Stroop JR. Studies of interference in serial verbal reactions. J Exp Psychol. 1935;18:643–662.
 
Golden C, Freshwater SM. The Stroop Color and Word Test: A Manual for Clinical and Experimental Uses. Wood Dale, IL: Stoelting Co, 2002.
 
Chafetz MD, Matthews LH. A new interference score for the Stroop test. Arch Clin Neuropsychol. 2004;19(4):555–567.
 
Golden CJ. The measurement of creativity by the Stroop Color and Word Test. J Pers Assess. 1975;39(5):502–506.
 
Jensen AR. Scoring the Stroop test. Acta Psychol (Amst). 1965;24(5):398–408.
 
Koga H, Takashima Y, Murakawa R, Uchino A, Yuzuriha T, Yao H. Cognitive consequences of multiple lacunes and leukoaraiosis as vascular cognitive impairment in community-dwelling elderly individuals. J Stroke Cerebrovasc Dis. 2009;18(1):32–37.
 
Matser JT, Kessels AG, Lezak MD, Troost J. A dose-response relation of headers and concussions with cognitive impairment in professional soccer players. J Clin Exp Neuropsychol. 2001;23(6):770–774.
 
May CP, Hasher L. Synchrony effects in inhibitory control over thought and action. J Exp Psychol Hum Percept Perform. 1998;24(2):363–379.
 
Mitrushina MM, Boone KB, Razani J, D’Elia LF. Handbook of Normative Data for Neuropsychological Assessment (2nd ed.). New York: Oxford University Press, 2005.
 
Murphy CF, Gunning-Dixon FM, Hoptman MJ, Lim KO, Ardekani B, Shields JK, Hrabe J, Kanellopoulos D, Shanmugham BR, Alexopoulos GS. White-matter integrity predicts stroop performance in patients with geriatric depression. Biol Psychiatry. 2007;61(8):1007–1010.
 
Stout JC, Paulsen JS, Queller S, Solomon AC, Whitlock KB, Campbell JC, Carlozzi N, Duff K, Beglinger LJ, Langbehn DR, Johnson SA, Biglan KM, Aylward EH. Neurocognitive signs in prodromal Huntington disease. Neuropsychology. 2011;25(1):1–14.
 
Strauss E, Sherman EMS, Spreen O. A Compendium of Neuropsychological Tests: Administration, Norms, and Commentary, 3rd ed. New York: Oxford University Press, 2006.
 
Thomas M, Smith A. An Investigation into the Cognitive Deficits Associated with Chronic Fatigue Syndrome. Open Neurol J. 2009;3:13–23.
 
Wall SE, Williams WH, Cartwright-Hatton S, Kelly TP, Murray J, Murray M, Owen A, Turner M. Neuropsychological dysfunction following repeat concussions in jockeys. J Neurol Neurosurg Psychiatry. 2006;77(4):518–520.
Recommended Instrument for
HD, ME/CFS, MS, SRC and Stroke