Josephson SA, Hills NK, & Johnston SC (2006). NIH Stroke Scale reliability in ratings from a large sample of clinicians. Cerebrovascular Diseases, 22(5-6), 389-395.
Objective: The NIH Stroke Scale (NIHSS) is widely used in stroke clinical care and trials. Certification in its use, most commonly through rating of video vignettes, is routinely required. To investigate the reliability of the NIHSS in a representative sample of raters, we examined the results of the most frequently used certification examination.

Methods: At the invitation of the National Stroke Association, we analyzed the results of all raters who completed one of two multiple patient videotaped certification examinations from 1998 to 2004. Total scores for each vignette were calculated and ratings were compared based on percentile of responses and modified kappa scores.

Results: There were 7,405 unique raters with 38,148 individual NIHSS item responses; median scores for each vignette ranged from 0 to 31. Total NIHSS scores varied widely between raters; scoring for 7 of the 11 patients (64%) had a four or more point difference in NIHSS score from the 5th to 95th percentile. The aphasia (kappa = 0.60) and facial palsy (0.65) items on the test contributed most to the variance in the total NIHSS score. Nurses agreed with the most common response on scoring more frequently than physicians (p < 0.0001). Taking the certification examination multiple times did not improve agreement.

Conclusions: In a large diverse sample of clinicians, inter-rater reliability for individual elements of the NIHSS on videotaped vignettes was generally good, but overall scoring was inconsistent and could impact clinical trial results. Whether additional training, modification of examination elements, or clearer definitions for scoring could improve reliability requires further study.

Copyright (c) 2006 S. Karger AG, Basel.
PMID: 16888381 [PubMed - as supplied by publisher]
Anthony H. Risser | neuroscience | neuropsychology | brain