This new blog has been written by Dennis Sherwood, who has been tracking the story of this year’s school exams for HEPI.
Friday, 12 June was the deadline for schools to submit their ‘centre assessment grades’ and rank orders for this year’s GCSE, AS and A-level students, and over the next several weeks those grades will be checked against a statistical algorithm to determine whether or not each submission meets certain criteria. If the result is ‘yes’, the school’s centre assessment grades will be confirmed. But if not, the exam board will over-rule the school’s submission, and determine whatever grades they think fit.
On Monday, 15 June, FFT Education Datalab posted a most informative blog presenting some findings from their recent service in which they checked schools’ draft GCSE grades against corresponding actual grades as awarded in 2019. As FFT point out, their study was based on draft grades, not the grades actually sent to the boards, and compared only against a single year, 2019. But even with those caveats in mind, the results are dramatic. Drawing on data from over 1,900 schools (that’s over half of the secondary schools in England), every GCSE subject they studied was over-bid: draft grades were, on average, higher than the grades actually awarded in 2019.
If this study is indicative of what is happening at the boards, then Ofqual has a choice. Either to ‘let it be’, on the grounds that ‘this year is special’. Or to enforce the ‘no grade inflation’ policy, so causing the boards to intervene, unilaterally and without consultation, to throw the schools’ grades away and place grade boundaries wherever they wish.
I don’t know what will happen. My hunch is ‘no grade inflation’. Looking back over the decades, the successive year-on-year increase in top grades was trumpeted as ‘proof’ of better teaching, better students, better education policies. Until about 2010, when ‘no grade inflation’ became the mantra, as first implemented in 2012 and maintained ever since. At that time, the Secretary of State for Education was Michael Gove. And his special adviser was Dominic Cummings. Enough said.
To my mind, the over-ruling of many centre assessment grades (that’s more than 25%) would be a great disappointment, and a measure of the failure of this year’s process; a process that, initially, held so much promise. I fear that the outcome will be a pretext for the powerful to say, ‘You had your chance, teachers, to show you could be trusted. And you blew it.’ The authoritarian pendulum will swing to the far right.
And, I fear, the blame will fall on the teachers.
One reason for this is that teachers have been placed right in the centre of the firing line by statements of the form ‘this year, students’ grades will be based on teacher assessments’, which have appeared widely in the press and in broadcast media: over the last few weeks, I have heard those words on news bulletins on both the BBC and Channel 4. A reader, listener or viewer might therefore reasonably infer that an individual student’s teacher, perhaps in consultation with colleagues at the same school, has the last word on each student’s grade. This in turn has stimulated an important discussion on teacher bias.
The truth, however, is that the grades awarded to students in August will not be those submitted by teachers. Rather, they will be the grades resulting from ‘statistical standardisation’ by which each exam board will ‘make sure that grades are fair between schools and colleges’. The grades to be awarded are those of the exam boards, not the teachers. This truth, however, is not widely known.
And a second reason is that teachers have been working, if not in the dark, then at best in very poor light.
In the press release for Gavin Williamson’s announcement on 20 March confirming this year’s exams would be cancelled, we read:
Ofqual will develop and set out a process that will provide a calculated grade to each student which reflects their performance as fairly as possible, and will work with the exam boards to ensure this is consistently applied for all students. The exam boards will be asking teachers, who know their students well, to submit their judgement about the grade that they believe the student would have received if exams had gone ahead.
This clearly states that teachers will be asked to submit their assessments, and that there will be a centrally-administered process, to be ‘set out’, that will ensure a uniform standard across the country.
The press release also contains a direct quotation from the Minister:
I have asked exam boards to work closely with the teachers who know their pupils best to ensure their hard work and dedication is rewarded and fairly recognised.
I read ‘work closely with’ as implying consultation, co-operation, and listening, so when I wrote my blog the following day, I was optimistic. Yes, asking teachers for their assessments of students, and working closely with them, is showing trust in teachers. Great.
But as the weeks have passed, and as the process has become somewhat clearer – but only somewhat – my initial optimism has been tempered.
My original hope, stimulated by the quotations just cited, was that there would be a dialogue between the schools and the boards, after the schools have submitted their assessments. Schools would therefore have the opportunity to explain why ‘Isaac’, an especially talented student, really does merit an A* in Physics, even though the school has never achieved a grade higher than B in the past.
That hope was dashed when Ofqual published their consultation document on 15 April. I was disappointed; but I can understand that an explicit promise of such a dialogue would open the door to ‘optimists’, and that much time, effort and wisdom would be required to distinguish the legitimate from the fraudulent. The absence of this dialogue is unfair to Isaac, and a blow to trust and integrity – wounds that will need to be healed. But Ofqual’s decision does make some pragmatic sense.
What makes no sense to me at all has been the failure of Ofqual and the exam boards (and the SQA too) to ‘set out’ (to quote the press release of 20 March) the full details of how, precisely, ‘statistical standardisation’ will work.
Yes, Ofqual have made statements such as:
The standardisation model will draw on … historical outcomes for each centre … and will consider, for A level, historical data from 2017, 2018 and 2019 … and for GCSE, data from 2018 and 2019 except where there is only a single year of data from the reformed specifications.
That’s helpful, for it rules out results before 2017 for A level, and rules in only GCSEs graded 9, 8, 7….
Those words ‘draw on’ and ‘consider’, though, are vague. Yes, they do imply that a school’s results, as actually achieved in any subject in the past, will be used to determine a benchmark against which this year’s submission will be compared. But how, precisely?
If I were a Head, I would wish to ensure that the grades my school submits are as close to the exam board’s benchmark as possible, so (presumably) increasing the likelihood that they will be confirmed, rather than over-ruled. To comply with the benchmark, I need to know how the benchmark is computed; I need to know if some clever statistics will be in place so that outliers such as Isaac will be acknowledged. But if I’m not told the rules, I can only guess.
For example, for A-level, I can ‘draw on’ and ‘consider’ the results of the last three years by calculating an average, weighting the results of each of the last three years equally. An alternative, still ‘drawing on’ and ‘considering’ the results of the last three years, is to weight the best year more heavily, and the worst year more lightly, resulting in a more favourable benchmark. Is one acceptable, and the other not? How do I know?
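To make that ambiguity concrete, here is a quick sketch of two equally defensible readings of ‘draw on’ and ‘consider’; the grade counts and the weights are my own inventions, purely for illustration:

```python
# Hypothetical numbers of grade-A awards in one subject over the last
# three years (2017, 2018, 2019). Invented for illustration only.
history = [22, 26, 18]

# Reading 1: weight all three years equally.
equal_weighted = sum(history) / len(history)

# Reading 2: still 'drawing on' all three years, but weighting the best
# year most heavily and the worst most lightly. The weights are arbitrary.
ranked = sorted(history)        # worst to best: [18, 22, 26]
weights = [0.2, 0.3, 0.5]       # favouring the better years
favourable = sum(w * x for w, x in zip(weights, ranked))

print(equal_weighted)           # 22.0
print(round(favourable, 1))     # 23.2
```

Both computations ‘draw on’ and ‘consider’ the same three years, yet they produce different benchmarks – and a school has no way of knowing which, if either, the board will use.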
A further ambiguity concerns the year-on-year variability. Suppose that in each of the previous years, my cohort has been 100 students, of whom 22, 26 and 18 were awarded grade A. The (equally weighted) average for this year is 22. But since the number of awards has varied from 18 to 26, any submission in that range is feasible, if not reasonable, and could well comply with Gavin Williamson’s statement that ‘hard work and dedication’ will be ‘rewarded and fairly recognised’. If 24 students are submitted for grade A, and if the exam board’s statistical standardisation determines a benchmark of 22, then the last two students in the rank order will be down-graded unilaterally and without consultation. Even worse: if, in good faith, every school ‘bids up a bit’, then grade inflation is blown sky high, forcing the boards to intervene. That’s exactly what FFT Education Datalab’s findings suggest has actually happened – as is inevitable if schools do not know what the benchmark is, how closely they must comply with it, and that even ‘modest optimism’ is likely to be penalised.
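The arithmetic of that example can be sketched in a few lines; the grade counts are those quoted above, and the equally weighted benchmark is, of course, just one guess at what the board might actually compute:

```python
# Grade-A awards in each of the last three years, for a cohort of 100.
history = [22, 26, 18]

# One plausible reading of 'draw on': an equally weighted average.
benchmark = sum(history) / len(history)   # 22.0

# The school, in good faith, submits 24 students for grade A --
# well inside the historical range of 18 to 26.
submitted = 24

# If the board holds the school to the benchmark, the last students in
# the rank order are down-graded, unilaterally and without consultation.
downgraded = max(0, submitted - round(benchmark))
print(downgraded)   # 2
```

Multiply that ‘bid up a bit’ of two students by every school in the country, and the aggregate grade inflation that forces the boards to intervene follows inevitably.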
I’m not arguing against setting the benchmark at the simple average 22. What I am arguing is that I think it would have been very helpful to everyone if the rules had been ‘set out’ in full. That way, every school would have been able to replicate the process in advance. They would therefore know, before the grades were submitted, whether those grades are compliant or not, and so have some awareness of the likelihood of their grades being confirmed or over-ruled. A school could still submit non-compliant grades if it wished, but it would be doing so knowingly.
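Had the rules been ‘set out’ in full, the compliance check each school could have run in advance might look something like this; the benchmark formula and the tolerance are, necessarily, my own guesses, since the real rules were never published:

```python
# A sketch of the check a school could run before submitting, IF the
# rules had been published. Both the benchmark formula and the tolerance
# are invented: the actual standardisation rules were never set out.
def is_compliant(submitted_count, history, tolerance=0):
    """Return True if the submitted grade count sits within the assumed
    tolerance of an equally weighted historical benchmark."""
    benchmark = sum(history) / len(history)
    return abs(submitted_count - benchmark) <= tolerance

print(is_compliant(24, [22, 26, 18]))               # False: 24 vs benchmark 22
print(is_compliant(24, [22, 26, 18], tolerance=4))  # True
```

With such a check in hand, a school submitting non-compliant grades would at least be doing so knowingly.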
Likewise, students and parents would know exactly how the process is being conducted, and be absolutely clear that the grades, as awarded, will be determined not by the teachers, but by the exam boards.
What has actually happened, however, is to me unsatisfactory, and could lead to trouble. Given that teachers are being required, in essence, to second-guess the answer-the-exam-boards-have-thought-of-first, have they been set up to fail? Are teachers at risk of being piggies-in-the-middle, unfairly blamed for grades that they did not recommend? When teachers’ grades are seen to have been over-ruled downwards – as the FFT Education Datalab results suggest will happen – will students be disappointed that their teachers’ assessments have been ignored? Will parents become angry at the limited, and highly technical, grounds for appeal?
If the boards already ‘know the answer’, as I suspect they do, then surely it would have been both more honest, and far simpler, for each board to have written to each school saying ‘our statistical algorithm has determined that, for your cohort of 53 students for 2020 A level Geography, you are allocated 4 A*s, 10 As, 15 Bs… Please enter in each grade box the names of the students to be awarded each grade, ensuring that no grade exceeds its allocation’ (that idea, by the way, is not my own, but emerged during a conversation with Rob Cuthbert, editor of SRHE News and Blog).
That would have been much easier for the centres to do: no worrying about the grades, no agonising over the rank order for all the students. And it would have focused attention on the right place – on deciding which students are to be on which side of each grade boundary, as fairly as humanly possible.
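As a sketch of how mechanical that allocation exercise would be for a centre: the A*, A and B counts below are from the imagined letter above, while the grade-C count is my own filler to complete the cohort of 53 (the letter’s elided lower grades are left out):

```python
# Hypothetical allocation for a cohort of 53 A-level Geography students.
# A*, A and B counts come from the imagined board letter; the C count is
# invented filler so the allocations sum to 53.
allocation = [("A*", 4), ("A", 10), ("B", 15), ("C", 24)]

def assign_grades(rank_order, allocation):
    """Walk down the school's rank order, filling each grade box in turn,
    so that no grade exceeds its allocation."""
    grades = {}
    students = iter(rank_order)
    for grade, quota in allocation:
        for _ in range(quota):
            grades[next(students)] = grade
    return grades

# Usage: students listed best first, as the school would rank them.
rank_order = [f"student_{i:02d}" for i in range(1, 54)]
result = assign_grades(rank_order, allocation)
print(result["student_01"])   # A*
print(result["student_05"])   # A
```

The school’s only remaining judgement – the one it is genuinely best placed to make – is the rank order itself.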
I still believe that this year’s results will be more fair than hitherto, for, fundamentally, the rank orders determined by teachers are, in my opinion, more reliable than the lottery of exam marks. But I also believe that the whole process would have been far better had the rules been published, and even better still if there had been some opportunity for schools to justify their Isaacs.
But if FFT Education Datalab’s findings are indeed a sign of things-to-come, those things could be quite nasty, with teachers, totally unfairly, taking the fall.
Oh dear. What a missed opportunity.