The consequences of this year’s Great Grading Disaster continue to rumble on. Many students still feel damaged as a result of the ‘internal moderation’ exercised by their schools before their ‘centre assessment grades’ were submitted, hence the petition that, although rejected by the Department for Education (DfE), will be debated in Parliament on Monday 12 October. That debate might also take in the growing pressure to reform the broken school exam system, exerted by, for example, the ‘One Nation’ group of Tory MPs, who are campaigning to discard GCSEs, and the broadly based ‘Rethinking Assessment’ community, who also have GCSEs in their sights, citing the unreliability of exam grades as one of the key drivers of change.
Scotland has already taken that idea on board, at least in the short term, having announced that the summer 2021 National 5 exams (roughly equivalent to GCSE) will not be held, with grades being determined by teacher assessment and coursework. Meanwhile, at the time of writing, Ofqual and the DfE continue, as ever, to defend the status quo, confirming that GCSE, AS and A level exams will take place, though possibly a few weeks later than originally scheduled.
My personal view is that, over the coming months, different schools will suffer different degrees of virus-induced disruption, thereby ploughing up any educational level playing-field that might have existed, if one ever did. So my ‘Plan A’ would be to decide, now, to award grades for all exams based on teacher assessment, but done properly this time, along the lines of, for example, the second alternative suggested here, so that students, teachers, parents, carers, employers, universities and colleges all know what to expect. And in my contingency drawer, I would have some exam papers, so ‘Plan B’ would be to hold exams only if everyone agrees, sometime around Easter 2021, that two conditions are satisfactorily fulfilled:
- The exams must be fair, in that all students must have had an equal opportunity for learning.
- The resulting grades must be fully trustworthy and reliable, in contrast to the current position, as described so eloquently, but obliquely, in the recent statement by Dame Glenys Stacey, Ofqual’s Interim Chief Regulator, that exam grades ‘are reliable to one grade either way’.
Indeed, the unreliable grade problem is back to bite all those currently taking the autumn 2020 AS and A level exams – the 22,020 students who, presumably, were disappointed with the grades they ended up with after this summer’s chaos, and are hoping for something better.
The following table shows the number of candidates sitting each subject (column 2), and the corresponding measure, if known, for that subject’s average grade reliability (column 3), as shown in Table 12 on page 21 of Ofqual’s November 2018 report, Marking consistency metrics – An update:
[Table: Candidates for autumn 2020 AS and A level exams]
Column 4 applies the reliability percentage (if known) to the subject cohort to estimate the likely number of ‘right’ grades; column 5, the estimate of the number of ‘wrong’ grades, is the difference between the subject cohort, and the number of ‘right’ grades.
Of the 19,210 candidates taking a subject for which the grade reliability is known, this analysis suggests that some 16,227 will be awarded the ‘right’ grade and 2,983, the ‘wrong’ one. That leaves a further 2,810 candidates taking subjects for which the reliability is unknown. As I have discussed elsewhere, the overall average reliability of exam grades is about 75%, so using that figure suggests a further 2,107 ‘right’ grades and 703 ‘wrong’ ones.
Accordingly, for the whole cohort of 22,020 candidates, the number of ‘right’ grades likely to be awarded is 18,334 (say, around 18,000), and the number of ‘wrong’ grades, 3,686 (say, around 3,500). The average grade reliability for this cohort is therefore about 18,334/22,020 = 83%, rather better than the overall average reliability of 75% – that’s because this autumn’s cohort contains a much greater proportion of the more reliable science subjects as compared to a ‘normal’ summer A level cohort.
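For readers who want to check the arithmetic, the estimate above can be reproduced in a few lines of Python. This is only a sketch using the totals quoted in the text, not the subject-by-subject figures from the table. The function name `split_cohort` is my own, and rounding the number of ‘right’ grades down to a whole candidate is an assumption, chosen because it matches the figures given above.

```python
# Totals quoted in the text; per-subject figures are not reproduced here.
KNOWN_COHORT = 19_210      # candidates in subjects with a measured reliability
KNOWN_RIGHT = 16_227       # estimated 'right' grades among those candidates
UNKNOWN_COHORT = 2_810     # candidates in subjects with no measured reliability
OVERALL_RELIABILITY = 75   # overall average grade reliability, in percent


def split_cohort(cohort, reliability_percent):
    """Split a cohort into estimated 'right' and 'wrong' grades.

    Assumption: the 'right' count is rounded down to a whole candidate,
    which reproduces the figures quoted in the text.
    """
    right = cohort * reliability_percent // 100
    return right, cohort - right


known_wrong = KNOWN_COHORT - KNOWN_RIGHT
unknown_right, unknown_wrong = split_cohort(UNKNOWN_COHORT, OVERALL_RELIABILITY)

total_cohort = KNOWN_COHORT + UNKNOWN_COHORT
total_right = KNOWN_RIGHT + unknown_right
total_wrong = known_wrong + unknown_wrong
cohort_reliability = total_right / total_cohort

print(f"Known subjects:   {KNOWN_RIGHT:,} right, {known_wrong:,} wrong")
print(f"Unknown subjects: {unknown_right:,} right, {unknown_wrong:,} wrong")
print(f"Whole cohort:     {total_right:,} right, {total_wrong:,} wrong")
print(f"Average reliability: {cohort_reliability:.0%}")
```

Changing `OVERALL_RELIABILITY` shows how sensitive the total is to the figure assumed for the subjects whose reliability has not been measured.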
There are two other important differences. First, more than 750,000 students typically take AS and A levels, so this autumn’s cohort of 22,020 is very much smaller. Secondly, those choosing to take this autumn’s exams are likely to be seeking up-grades as compared to their awards this summer. So no student who already has an A* will be sitting, but there are probably many with an A who hope for an A*, or with a B who hope for an A. That suggests a clustering of the cohort around grade boundaries in general, and the B/A and A/A* boundaries in particular.
That gives Ofqual a headache. The basic assumptions that they have used to determine grade boundaries in the past – that the cohorts are large, and that there is a spread of marks across all abilities – both fail. That undermines the policy of ‘comparative outcomes’, for there is no suitable comparison. Furthermore, it challenges the concept of ‘norm referencing’, whereby grades are determined by reference to the performance of other candidates, with a given percentage (more or less) of the cohort being awarded each grade across the whole grade scale. But any move towards ‘criterion referencing’, in which each candidate is awarded the grade they are deemed to merit, even if all candidates are awarded A*s, not only cuts across years of policy, but also raises the spectre that different standards might be used by the different exam boards – suppose, for example, that [this board’s] Physics exam is ‘easier’ than [that one’s]? This possibility is a consequence of the decision for there to be a ‘competitive market’ in school exams, and is, to my mind, another reason for reform: if anyone can identify on what grounds the current exam boards ‘compete’, and why any such competition is a ‘good thing’, please post a comment!
Even if Ofqual can solve the problems of drawing the grade boundaries in fair places, and ensuring consistency of standards across the exam boards, the problem of the fundamental unreliability of grades remains. Perhaps Ofqual will pull something out of their hat – for example, using only senior examiners to mark the scripts, or reviewing, very carefully, all scripts whose marks are at, or very close to, a grade boundary, as might be possible for this autumn’s smaller cohorts.
But perhaps not. In which case, maybe as many as 3,500 students will be ‘awarded’ the wrong grade when the results are announced on Thursday, 17 December. But they won’t know, for they will have no ‘second opinion’; nor will they be able to appeal. And that estimate of 3,500 could well be wrong too, for it is based on Ofqual’s measurements of grade reliability derived from ‘normal’ cohorts. But since Ofqual don’t routinely declare the reliability of the grades awarded, and are most unlikely to do so this December, no one will know what the right number of wrong grades is.
The unreliability of grades – now acknowledged by Ofqual – is a scandal. And to me, this problem must be fixed before any exams are re-instated. Perhaps this too will feature in the parliamentary debate scheduled for next Monday, 12 October.