This blog is the latest in a series by Dennis Sherwood, who has been tracking the 2020 results fiasco for HEPI.
On Wednesday 10 June 2020, Dr Michelle Meadows, Ofqual’s Executive Director for Strategy, Risk and Research, and Sally Collier, then Ofqual’s Chief Regulator but no longer in post, appeared before the Education Select Committee, chaired by Robert Halfon MP. Given what we know now, the transcript makes interesting reading.
A few weeks later, on 11 July, the Select Committee published their report, Getting the grades they’ve earned. Covid-19: the cancellation of exams and ‘calculated’ grades – a title whose irony is evident only now that the ‘calculated’ grades have been scrapped.
Today, Wednesday 2 September, the Select Committee convenes again, with Dr Meadows present once more, this time accompanied by Ofqual’s Chair, Roger Taylor; its Executive Director for General Qualifications, Julie Swan; and its newly (re-)appointed Acting Chief Regulator, Dame Glenys Stacey.
Many people have been, and continue to be, damaged by this year’s catastrophe, and the Committee could well spend its time considering specific cases. Those are important, and of course critical to the individuals concerned. There are, however, many fundamental issues that remain murky and unresolved. This morning’s meeting therefore provides an important opportunity to illuminate these dark corners, and so here are some key questions that I think demand answers.
1. Who, specifically, took the decision to design and build a (very complex) algorithm to predict every grade distribution for every subject in every school in the country, rather than, for example, use a (much simpler) method to ‘sense-check’ schools’ submissions? This was the fundamental error of judgement from which all else followed.
2. When was this decision taken? In Gavin Williamson’s statement of 20 March that Ofqual will ‘…produce a calculated grade for each student…’, does the explicit reference to ‘a calculated grade’ suggest that the decision to follow the ‘algorithmic’ approach had already been taken, before 20 March?
3. What other approaches were suggested, and why were they rejected? What evidence is there of the corresponding discussions?
4. Who, specifically, designed the algorithm? Who wrote the corresponding computer code? Was there any involvement of individuals not fully employed by Ofqual or the exam boards? If so, who, in what capacities, and at what cost?
Were schools misled?
5. Why were schools asked for Centre Assessment Grades (CAGs), when they were known, from the outset, to be superfluous? The answer ‘because CAGs were needed for “small” cohorts’ is interesting in that it undermines the use of the algorithm: if teachers’ judgements are acceptable for ‘small’ cohorts, why not for ‘large’ ones too?
6. Given that schools were asked for CAGs, why were they denied the opportunity, at the same time, to provide evidence for ‘outliers’ that did not fit historic patterns?
7. Why was Ofqual’s ‘Guidance’ not clearer, more specific and more honest about what was actually happening? Did Ofqual set out, intentionally, to mislead? In particular, why were schools placed in the impossible position of being asked to meet two mutually contradictory requirements simultaneously? They were: i) to give a ‘realistic judgement of the grade each student would have been most likely to get if they had taken their exam(s)’; and ii) to bear in mind that grades would be constrained by the undefined process of ‘statistical standardisation’, which would draw on ‘evidence including expected grade distributions at national level, results in previous years at individual centre level, [and] the prior attainment profile of students at centre level’ – this being widely interpreted as implying ‘no grade inflation’ (as indeed was proven by the A-level results as first awarded on 13 August, and the down-grading of nearly 40% of the A-level CAGs).
8. In short, were teachers set up to fail?
How matters evolved
9. On 15 June, FFT Education Datalab published a report, a comment on which includes a statement that ‘approximately 37%’ of GCSE grades might be down-graded. This precedes by some seven weeks the article published in the Guardian on 7 August revealing that ‘nearly 40% of A-level grades submitted by teachers are set to be down-graded’, and was the earliest warning of what was to come. Who at the Department for Education and at Ofqual saw this report? Why were the implications of this warning not heeded? Did no one in authority care that teachers’ judgements were to be discarded to such a great extent?
10. How many of the recommendations in the Select Committee’s report of 11 July 2020 were successfully actioned by Ofqual within an appropriate time? And if any were not, why not? What accountability should Ofqual therefore bear?
11. Why, specifically, were so many A-level and AS CAGs down-graded? How many GCSE CAGs would have been down-graded had the u-turn of 17 August not taken place? How many of all these were the result of ‘gaming’? How many were the result of ‘over-optimism’? How many were simply the result of the need to round fractions to whole numbers, or attributable to year-on-year variability, in relation to which Ofqual failed to give precise instructions?
Resolving the still-present injustice
12. Given the current unresolved injustices, what needs to happen now to allow those who followed the implied rule of ‘no grade inflation’ to appeal against CAGs that were ‘internally moderated’ downwards, against teachers’ better judgement?
13. Looking ahead to the return of exams, in whatever form, and whenever they might take place, can Ofqual please explain, precisely, what is meant by their statement of 11 August 2019 that ‘more than one grade could well be a legitimate reflection of a candidate’s performance’? What are the corresponding implications?
14. Since the same statement further confirms that ‘This is not new, the issue has existed as long as qualifications have been marked and graded’, why has the ‘issue’ of unreliable grades not long since been resolved, especially given that the fundamental problem is not ‘marking error’ but a failure of the policy used to determine grades from marks?
15. Since Ofqual’s own research shows that, over each of the last several years, on average, about 1 grade in every 4 awarded has been ‘wrong’, why does Ofqual’s recently published Corporate Plan 2020-21 not identify as a key action ensuring ‘all exam grades are reliable and trustworthy’?
16. Overall, and looking not just at this year’s fiasco but over the whole decade of Ofqual’s existence, to what extent has Ofqual successfully fulfilled its statutory obligation, as defined by Section 22 of the Education Act 2011 ‘to secure that regulated qualifications give a reliable indication of knowledge, skills and understanding…’?