This blog was kindly contributed by Dennis Sherwood, author of Missing the Mark: Why so many school exam grades are wrong, and how to get results we can trust, published by Canbury Press.
At a hearing of the Education Select Committee on 12 October 2022, Ofqual’s Chief Regulator, Dr Jo Saxton, stated that the fact that:
… very few grades change when marking is reviewed … is a key piece of evidence that exam grades are right …
This can be quantified by reference to Tables 1 and 2 in Ofqual’s statistics, published yesterday (15 December 2022), relating to the summer 2022 GCSE, AS and A Level exams in England: a total of 53,765 grades were changed, 0.9% of the 5,967,675 grades awarded.
Whether or not more than 50,000 grade changes are ‘very few’, or indeed few enough, is a matter of opinion; more interesting, to my mind, is Dr Saxton’s subsequent assertion that this ‘is a key piece of evidence that exam grades are right’. My understanding of these words is that Dr Saxton is saying that the fact that only 0.9% of grades were changed – and so must have been wrong when awarded – implies that the other 99.1% of grades must be right. How reassuring.
Dr Saxton is not alone in making claims of this type.
Another such claim was made in an interview on BBC Radio 4’s Today programme on GCSE results day, 2019, the last time before this year that ‘real’ exams were held. When asked about the reliability of the grades awarded, Nick Gibb, then, as now, Minister for Schools, answered:
only about 1% of grades are changed on appeal.
Mr Gibb does not explicitly say ‘therefore the remaining 99% of grades are right’, but he surely hints at that, and quite possibly hopes that the programme’s listeners will reach that conclusion.
Pearson Edexcel, however, are not so coy – here is a quotation from their website:
Pearson Edexcel delivers the most accurate exam results in the UK with 99.2% of grades accurate on results day in 2017, meaning 99.2% of teachers and students can feel confident that they’ll get the right outcome.
All very convincing; all very impressive.
But are claims of the form, ‘only 1% of grades are changed, therefore the remaining 99% of grades are right’, true?
Alas, they are not.
Many reading this will already have spotted the flaw, so please forgive my spelling out the explanation here.
If a grade is changed as the result of a ‘marking review’ (to use Dr Saxton’s words), a ‘challenge’ (Ofqual’s preferred term), or an ‘appeal’ (the word that most people would, I think, actually use), then the originally-awarded grade must have been wrong. If it wasn’t, why change it? That’s important, because whether an originally-awarded grade is confirmed or changed as the result of a challenge is the only test of whether that grade was right or wrong; this cannot be determined from the awarded grade alone.
For a grade to be confirmed, or indeed changed, it must have been challenged first. But if a grade is not challenged, that grade might be right, or it might be wrong – no one has looked, so no one knows.
According to Ofqual’s statistics, it is true that 53,765 of the summer 2022 grades, 0.9% of the total number of grades awarded, were changed. Ofqual’s statistics also show, however, that the total number of grades challenged was 233,710. Those 53,765 changed grades therefore relate only to the 233,710 grades that were challenged, implying that about 23% of challenges – that’s nearly 1 in 4 – resulted in a grade change.
A further, and important, number is not published by Ofqual, and can only be inferred. That number is 5,733,965 (5,967,675 – 233,710), the number of awarded grades that were not challenged.
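The arithmetic so far can be checked in a few lines. This is only a minimal sketch: the three input figures are those quoted above from Ofqual's statistics, and the variable names are my own.

```python
# Figures quoted in the text from Ofqual's summer 2022 statistics.
awarded = 5_967_675     # grades awarded
challenged = 233_710    # grades challenged
changed = 53_765        # grades changed after a challenge

# The number Ofqual does not publish: grades that no one looked at again.
unchallenged = awarded - challenged

# The same 53,765 changes, measured against two different denominators.
changed_share_of_awarded = changed / awarded
changed_share_of_challenged = changed / challenged

print(f"Unchallenged grades: {unchallenged:,}")
print(f"Changed, as a share of all grades awarded: {changed_share_of_awarded:.1%}")
print(f"Changed, as a share of grades challenged: {changed_share_of_challenged:.1%}")
```

The headline 0.9% and the 23% describe the same grade changes; only the denominator differs.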
Because those 5,733,965 grades were not challenged, there is no information as to whether or not any of them are right or wrong. It is possible that they are all right; it is also possible that they are all wrong. No one knows, for no one has looked.
Assertions of the type, ‘only 1% of grades are changed, therefore the remaining 99% of grades are right’, assume that all the unchallenged grades are right. Given that this past summer 5,733,965 grades were not challenged – 96.1% of the 5,967,675 grades awarded – you may take your own view on the likelihood that this assumption is valid. It might be. Or perhaps all the 5,733,965 unchallenged grades would have been changed had they been challenged, and so were wrong, but remain undiscovered because no one looked.
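To see how little the challenge data alone pin down, here is a rough sketch of the two extremes. It assumes, as the argument above does, that a changed grade was wrong and a confirmed grade was right; the variable names are illustrative only.

```python
# Figures quoted in the text from Ofqual's summer 2022 statistics.
awarded = 5_967_675
challenged = 233_710
changed = 53_765
unchallenged = awarded - challenged

# Share of grades that nobody has re-examined.
unchallenged_share = unchallenged / awarded

# Extreme 1: every unchallenged grade is right,
# so the only wrong grades are the ones that were changed.
min_wrong = changed

# Extreme 2: every unchallenged grade is wrong.
max_wrong = changed + unchallenged

print(f"Unchallenged share of all grades: {unchallenged_share:.1%}")
print(f"Fewest grades that could be wrong: {min_wrong / awarded:.1%}")
print(f"Most grades that could be wrong: {max_wrong / awarded:.1%}")
```

On these figures, the challenge statistics by themselves are consistent with anything from about 0.9% to about 97% of grades being wrong. The ‘99% right’ claim simply picks the most flattering end of that range.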
The reality, of course, is that the 233,710 grades challenged represent just a sample of the total population – a sample that resulted in 53,765 grade changes, 23% of the grades challenged.
If this were a representative sample, it would suggest that about 23% of all the awarded grades would be changed were they to be challenged, and so were wrong when they were announced, with about 77% being right – numbers very different from a claim of around 1% wrong, 99% right.
There are, of course, many reasons why grades that are challenged are not a representative sample of the whole population – a bias to scripts marked just below grade boundaries is very likely, as is a bias towards those who can afford to risk forfeiting the fee for a re-mark.
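For what it is worth, the naive extrapolation looks like this. It is a sketch under the representative-sample assumption which, for the reasons just given, almost certainly does not hold, so the result is an illustration rather than an estimate.

```python
# Figures quoted in the text from Ofqual's summer 2022 statistics.
awarded = 5_967_675
challenged = 233_710
changed = 53_765

# Change rate observed among the grades that were challenged.
observed_change_rate = changed / challenged

# ASSUMPTION (known to be doubtful): challenged grades are a
# representative sample of all grades awarded.
extrapolated_wrong = observed_change_rate * awarded

print(f"Observed change rate among challenges: {observed_change_rate:.0%}")
print(f"Grades wrong if that rate applied to all: {extrapolated_wrong:,.0f}")
```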
So that leaves the question, ‘how reliable are grades as awarded?’ somewhat in the air. What do you think? Is your intuition that the answer is, as Dr Saxton appears to claim, close to 1% wrong and 99% right? Or leaning towards 23% wrong and 77% right?
It turns out that Ofqual answered that question four years ago, in 2018.
In the mid-2010s, Ofqual commissioned a major research project in which entire cohorts of scripts in 14 subjects were marked twice, as if every script had been challenged. Importantly, the whole population was studied, not just a sample.
The overall answer?
25% wrong, 75% right.
On average, about one grade in every four awarded is wrong.
To make that real, of the 6 million grades awarded in August 2022, about 1.5 million were wrong, of which only a very small fraction were challenged, discovered and corrected. To me, this is a tragedy, a tragedy that does much damage.
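A back-of-envelope version of that calculation follows. It assumes the one-in-four average from the 14-subject study can be applied to every grade awarded in summer 2022, and that every changed grade was a wrong grade put right; both are simplifications, and the variable names are mine.

```python
# Figures quoted in the text from Ofqual's summer 2022 statistics.
awarded = 5_967_675
changed = 53_765

# ASSUMPTION: the average of about one wrong grade in four applies
# across all grades awarded.
wrong_rate = 0.25

estimated_wrong = wrong_rate * awarded

# Share of the estimated wrong grades that were actually corrected.
corrected_share = changed / estimated_wrong
uncorrected = estimated_wrong - changed

print(f"Estimated wrong grades: {estimated_wrong:,.0f}")
print(f"Corrected after a challenge: {corrected_share:.1%}")
print(f"Left uncorrected: {uncorrected:,.0f}")
```

On these assumptions, fewer than one in twenty-five of the wrong grades was ever corrected.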
And a tragedy made worse by false statements of the form, ‘the fact that few grades are changed is key evidence that grades are right’. Especially when the flaw in this ‘argument’ was recognised long ago in a blog, posted on Ofqual’s website on 29 September 2014, written by Ofqual’s then Chief Regulator, Glenys Stacey (now Dame Glenys), in which we read:
We know that qualification grade changes represent just 0.6 per cent of all GCSE, AS and A level certifications. On this evidence, we might conclude that a little over 1 in 200 exam papers contained a marking error or inconsistency, but that assumes that all unchallenged grades are as correct as those challenged.
But not just that.
Understanding and interpreting simple data – numeracy – is a fundamental life skill. Surely everyone who has been to school should spot the flaw.
For the average person not to spot it is, perhaps, understandable if regrettable.
But for Ofqual, the DfE and Pearson Edexcel to display such innumeracy – and in such a vital context – is, to my mind, a disgrace.
As long as those in authority continue to believe that ‘few grades change when marking is reviewed … is a key piece of evidence that exam grades are right’, then one grade in four will continue to be wrong, and the damage will continue.
Maybe you think that this doesn’t matter; maybe you just don’t care.
But if you do think it matters – if you do care – what can you, individually, you and your colleagues, collectively, and the higher education sector as a community, do to put pressure on those in authority to fix this?