
BMC Med Educ. 2014 Sep 26;14:204. doi: 10.1186/1472-6920-14-204.

Implementing statistical equating for MRCP(UK) Parts 1 and 2.

BMC medical education

I C McManus, Liliana Chis, Ray Fox, Derek Waller, Peter Tang

Affiliations

  1. UCL Medical School, University College London, Gower Street, London WC1E 6BT, UK.

PMID: 25257070 PMCID: PMC4182791 DOI: 10.1186/1472-6920-14-204

Abstract

BACKGROUND: In 2008 and 2010, the MRCP(UK) changed the standard-setting of its Part 1 and Part 2 examinations from a hybrid Angoff/Hofstee method to statistical equating using Item Response Theory (IRT), with UK graduates as the reference group. The present paper considers the implementation of the change, the question of whether the pass rate increased amongst non-UK candidates, any possible role of Differential Item Functioning (DIF), and changes in examination predictive validity after the change.

METHODS: Analysis of data of MRCP(UK) Part 1 exam from 2003 to 2013 and Part 2 exam from 2005 to 2013.

RESULTS: Inspection suggested that Part 1 pass rates were stable after the introduction of statistical equating, but showed greater annual variation, probably due to stronger candidates taking the examination earlier. Pass rates seemed to have increased in non-UK graduates after equating was introduced, but this increase was not associated with any changes in DIF after statistical equating. Statistical modelling of the pass rates for non-UK graduates found that pass rates, in both Part 1 and Part 2, were increasing year on year, with the changes probably beginning before the introduction of equating. The predictive validity of Part 1 for Part 2 was higher with statistical equating than with the previous hybrid Angoff/Hofstee method, confirming the utility of IRT-based statistical equating.

CONCLUSIONS: Statistical equating was successfully introduced into the MRCP(UK) Part 1 and Part 2 written examinations, resulting in higher predictive validity than the previous Angoff/Hofstee standard setting. Concerns about an artefactual increase in pass rates for non-UK candidates after equating were shown not to be well-founded. Most likely the changes resulted from a genuine increase in candidate ability, albeit for reasons which remain unclear, coupled with a cognitive illusion giving the impression of a step-change immediately after equating began. Statistical equating provides a robust standard-setting method with a better theoretical foundation than judgemental techniques such as Angoff; it is also more straightforward and requires far less examiner time, while providing a more valid result. The present study provides a detailed case study of introducing statistical equating, and of issues which may need to be considered with its introduction.

