Measuring interrater reliability among multiple raters: an example of methods for nominal data.

Stat Med. 1990 Sep;9(9):1103-15. doi: 10.1002/sim.4780090917.

Statistics in medicine

K L Posner, P D Sampson, R A Caplan, R J Ward, F W Cheney

Affiliations

Department of Anesthesiology, University of Washington, Seattle 98195.

PMID: 2244082 DOI: 10.1002/sim.4780090917

Abstract

This paper reviews and critiques various approaches to the measurement of reliability among multiple raters in the case of nominal data. We consider measurement of the overall reliability of a group of raters (using kappa-like statistics) as well as the reliability of individual raters with respect to a group. We introduce modifications of previously published estimators appropriate for measurement of reliability in the case of stratified sampling frames and we interpret these measures in view of standard errors computed using the jackknife. Analyses of a set of 48 anaesthesia case histories in which 42 anaesthesiologists independently rated the appropriateness of care on a nominal scale serve as an example.
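As a rough illustration of the kind of analysis the abstract describes, the sketch below computes Fleiss' kappa, one standard kappa-like statistic for multiple raters and nominal categories, together with a delete-one-case jackknife standard error. It is not the authors' modified estimator and does not handle stratified sampling frames. The function names and the toy ratings are hypothetical, and every case is assumed to be rated by the same number of raters.

```python
import math
from typing import List, Sequence


def fleiss_kappa(counts: Sequence[Sequence[int]]) -> float:
    """Fleiss' kappa for a cases-by-categories table.

    counts[i][j] is the number of raters who assigned case i to category j.
    Assumes every case was rated by the same number of raters (at least 2)
    and that the ratings are not all in a single category.
    """
    n_cases = len(counts)
    n_raters = sum(counts[0])
    n_cats = len(counts[0])

    # Observed agreement: for each case, the proportion of rater pairs
    # that agree, averaged over cases.
    p_obs = sum(
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ) / n_cases

    # Chance agreement from the pooled category proportions.
    p_cat = [
        sum(row[j] for row in counts) / (n_cases * n_raters)
        for j in range(n_cats)
    ]
    p_exp = sum(p * p for p in p_cat)

    return (p_obs - p_exp) / (1.0 - p_exp)


def jackknife_se(counts: Sequence[Sequence[int]]) -> float:
    """Delete-one-case jackknife standard error of Fleiss' kappa."""
    rows: List[List[int]] = [list(r) for r in counts]
    n = len(rows)
    # Recompute kappa with each case left out in turn.
    loo = [fleiss_kappa(rows[:i] + rows[i + 1:]) for i in range(n)]
    mean = sum(loo) / n
    return math.sqrt((n - 1) / n * sum((k - mean) ** 2 for k in loo))


if __name__ == "__main__":
    # Hypothetical toy data: 5 cases, 4 raters, 3 nominal categories.
    toy = [
        [4, 0, 0],
        [2, 2, 0],
        [0, 3, 1],
        [1, 1, 2],
        [0, 0, 4],
    ]
    print("kappa:", fleiss_kappa(toy))
    print("jackknife SE:", jackknife_se(toy))
```

Here the jackknife resamples over cases rather than raters, which is one common choice when the cases are treated as the sampled units. A stratified design like the one mentioned in the abstract would need a different resampling scheme.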

Cited by

Interobserver reliability of radiographic assessment after radial head arthroplasty.

Bexkens R, Claessen FMAP, Kodde IF, Oh LS, Eygendaal D, van den Bekerom MPJ.
Shoulder Elbow. 2018 Apr;10(2):121-127. doi: 10.1177/1758573217719088. Epub 2017 Jul 10.
PMID: 29560038

Inter-observer variation in the diagnosis of coronal articular fracture lines in the lunate facet of the distal radius.

Wijffels MM, Guitton TG, Ring D.
Hand (N Y). 2012 Sep;7(3):271-5. doi: 10.1007/s11552-012-9421-5.
PMID: 23997731

The interobserver reliability of classification systems for radial head fractures: the Hotchkiss modification of the Mason classification and the AO classification systems.

Sheps DM, Kiefer KR, Boorman RS, Donaghy J, Lalani A, Walker R, Hildebrand KA.
Can J Surg. 2009 Aug;52(4):277-282.
PMID: 19680511

Cantonese-Speaking Children Do Not Acquire Tone Perception before Tone Production-A Perceptual and Acoustic Study of Three-Year-Olds' Monosyllabic Tones.

Wong P, Fu WM, Cheung EYL.
Front Psychol. 2017 Aug 29;8:1450. doi: 10.3389/fpsyg.2017.01450. eCollection 2017.
PMID: 28900404

Agreement between Initial Classification and Subsequent Reclassification of Fractures of the Distal Radius in a Prospective Cohort Study.

van Leerdam RH, Souer JS, Lindenhovius AL, Ring DC.
Hand (N Y). 2010 Mar;5(1):68-71. doi: 10.1007/s11552-009-9212-9. Epub 2009 Jul 09.
PMID: 19588208

Reliability and repeatability of tibial plateau fracture assessment with an injury mechanism-based concept.

Zhang BB, Sun H, Zhan Y, He QF, Zhu Y, Wang YK, Luo CF.
Bone Joint Res. 2019 Sep 03;8(8):357-366. doi: 10.1302/2046-3758.88.BJR-2018-0331.R1. eCollection 2019 Aug.
PMID: 31537993

Interobserver reliability of the classification of capitellar osteochondritis dissecans using magnetic resonance imaging.

Bexkens R, Simeone FJ, Eygendaal D, van den Bekerom MP, Oh LS.
Shoulder Elbow. 2020 Aug;12(4):284-293. doi: 10.1177/1758573218821151. Epub 2019 Jan 16.
PMID: 32782483
