Rob Cuthbert is Emeritus Professor of Higher Education Management at the University of the West of England and Managing Partner of the Practical Academics consultancy. He wrote the 2020 HEPI blog ‘A-levels 2020: What students and parents need to know’ (which has had more hits than any other HEPI blog ever). You can find Rob on Twitter @RobCuthbert.
Don’t forget to sign up for the free HEPI / UCAS webinar ‘In Conversation’ with Clare Marchant, Chief Executive of UCAS, which will take place tomorrow (Friday, 30th July 2021).
Government ministers tend to announce in a taken-for-granted way that exams are the fairest way to assess students, before saying that in the difficult circumstances of the pandemic the arrangements this year are the best that could be done. They said that last year too, but it didn’t age well. Will things be better this year? Let’s look at how A-level exams and grading actually work.
In broad terms, examinations may be criterion-referenced or norm-referenced. Criterion-referenced exams (such as the driving test) assess achievement against a specified standard (does this person perform a range of driving tasks sufficiently well?) and there is no limit to the numbers who might pass or get top grades. Norm-referenced exams compare the achievements of a candidate with the achievements of a norm group or population, so there are limits on how many can get the best grades. For a time, A-levels were criterion-referenced, but steadily improving results led to political criticism of ‘declining standards’ and a change towards norm-referencing. Modular A-level curricula with more emphasis on coursework led to similar criticisms and Michael Gove as Secretary of State for Education initiated a much greater emphasis on a single terminal examination. This made the 2020 problems worse, because there was limited formal evidence to replace the cancelled examinations.
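The distinction between the two approaches can be made concrete with a toy sketch in Python. The marks, thresholds and grade shares below are invented for illustration; they are not real A-level cut-offs.

```python
# Toy illustration (hypothetical marks and cut-offs, not real A-level data):
# criterion-referencing applies fixed thresholds, so any number of candidates
# can earn the top grade; norm-referencing allocates grades by rank, so a
# fixed share of the cohort gets each grade.

marks = [48, 55, 61, 67, 70, 74, 78, 81, 85, 92]  # ten candidates' raw marks

def criterion_grade(mark, thresholds={"A": 80, "B": 70, "C": 60, "D": 50}):
    """Fixed standard: every candidate at or above a threshold gets that grade."""
    for grade, cutoff in thresholds.items():
        if mark >= cutoff:
            return grade
    return "U"

def norm_grades(marks, shares={"A": 0.2, "B": 0.3, "C": 0.3, "D": 0.2}):
    """Relative standard: a fixed share of the cohort gets each grade."""
    ranked = sorted(marks, reverse=True)
    grades, i = {}, 0
    for grade, share in shares.items():
        n = round(share * len(ranked))
        for m in ranked[i:i + n]:
            grades[m] = grade
        i += n
    return grades

crit = [criterion_grade(m) for m in marks]  # three candidates clear the A bar
norm = norm_grades(marks)                   # only the top 20% can get an A
```

With these invented numbers, three candidates reach the A threshold under criterion-referencing, but norm-referencing hands out only two As however well the cohort performs: improved performance shows up as ‘grade inflation’ in the first system and is invisible in the second.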
Although Ofqual assert that A-levels are neither criterion- nor norm-referenced, they are very close to the norm-referenced end of the spectrum. Any upward movement from the previous year’s distribution of grades tends to be described as ‘grade inflation’ rather than, say, the result of improved student performance or better teaching. Ofqual have a statutory duty to maintain public confidence in the integrity of the examination system and see ‘grade inflation’ as a threat to that confidence. The obsession with collective ‘grade inflation’ blinded the Government and Ofqual to the unfair individual consequences of the algorithm used in 2020 until its after-the-last-minute abandonment.
Examination performance can vary for all kinds of reasons, including variable student effort, health, nerves, the quality of teaching and much more. Examination systems normally involve not only marking but also one or more further checks, by a second marker or a moderation process of some kind – and large-scale examinations will have several levels of checks. There may be extenuating circumstances for individuals, but nevertheless most people continue to accept without question the statement that ‘exams are fairest’. Despite the exam experience and the stresses it brings, the public probably still feel confident that at least marks and grades are unbiased and somehow ‘fair’.
But perhaps not for much longer. Dennis Sherwood, a prominent campaigner for a fairer system, almost single-handedly forced Ofqual’s chief regulator Glenys Stacey to admit to the Education Select Committee in 2020 that the exam system is reliable only ‘to one grade either way’. Sherwood has shown that one-in-four grades are likely to be wrong. The fuzziness of grading is such that, even in mathematics, the least fuzzy of subjects, confidence approaches 100% only for A* and U grades and for scripts marked near the centre of an intermediate grade. For those close to grade boundaries there is often at least a 50% chance of error. Students with offers of university places calibrated in terms of A-level grades can reasonably conclude that the process is much more of a lottery than they had imagined.
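Why a script’s distance from a grade boundary matters so much can be shown with a toy simulation. The boundaries and the size of the marking spread below are invented numbers, not Ofqual’s model; the point is only the shape of the effect.

```python
import random

random.seed(1)

# Toy model (illustrative numbers only): a script's 'true' mark is re-marked
# with a small random marking error. Near a grade boundary, even a few marks'
# variation frequently changes the grade; mid-band, it almost never does.

BOUNDARIES = {"A": 80, "B": 70, "C": 60}  # hypothetical cut-offs

def grade(mark):
    for g, cut in BOUNDARIES.items():
        if mark >= cut:
            return g
    return "D"

def flip_rate(true_mark, spread=3, trials=10_000):
    """Share of simulated re-marks landing on a different grade."""
    true_grade = grade(true_mark)
    flips = sum(
        grade(true_mark + random.randint(-spread, spread)) != true_grade
        for _ in range(trials)
    )
    return flips / trials

mid_of_band = flip_rate(75)   # centre of the B band: the grade never flips
near_cutoff = flip_rate(79)   # one mark below the A boundary: flips over 40%
```

A mark of 75 with a ±3 spread stays a B every time, while a mark of 79 changes grade in over 40% of re-marks, which is broadly how a system can be highly reliable mid-band and close to a coin-toss at the boundaries.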
Even in non-pandemic times, it is thus questionable whether the current system of determining exam grades is a fair way of deciding on students’ achievements and their future opportunities.
The catastrophic mismanagement of the 2020 examinations has been exhaustively analysed, although we still await a satisfactory official review. Setting aside the incompetence and policy mistakes, there were three key failures. First was the failure to comprehend the profound consequences of grading for every individual, rather than the collective impact of ‘grade inflation’. As Nick Hillman observed in 2016: ‘Most big political questions can only be answered by balancing the needs of individuals against society.’ Second was the way the algorithm abolished the national competition for grades, replacing it with an approach which, subject by subject, set students in competition with their immediate classmates for the limited number of good grades available based on the past history of school achievement. The third failure was the secrecy over the Ofqual algorithm, which meant that hundreds of schools and colleges did the best they could by trying to fit their Centre-Assessed Grades (CAGs) to their best guess of the algorithm, in various ways. Some may have favoured high-performing students, skewing grades to A*/A but inevitably penalising other students in their own school with lower grades, in order to keep the school’s overall ‘grade inflation’ within the bounds they guessed would apply.
With the eventual adoption of unmoderated CAGs, any centres which recklessly inflated their students’ grades without regard to the algorithm’s likely effect would have come out as winners. However, the great majority of centres went to great lengths to model possible grade distributions and test them against the centre’s data for 2017-2019. The centres who were most careful to follow the national guidance, deliberately constraining their overall gradings to match their previous record, may have ended up penalising their students the most.
When more details of the algorithm emerged it was clear that other unfairness was baked into the original approach, for example in subjects with small cohorts, such as Music, where there were particularly large uplifts in grades. The switch to CAGs came too late for universities to make full adjustment to their admissions. Some students – who had first been denied by the algorithm but through CAGs achieved the grades they needed for their chosen university place – found, nevertheless, that their course was already full, even though the rising tide of CAG acceptance and universities’ flexibility managed to refloat some boats.
In 2020 there was, therefore, unfairness between students, within schools, between subjects and between schools – and perhaps between the 2020 cohort and its predecessors and successors. Much of the unfairness has never been addressed, which is why in some quarters there is continuing anger and threat of litigation.
For 2021, the process has changed to use Teacher-Assessed Grades (TAGs), broadly moving towards (but different from) the approach eventually adopted in 2020, as described in publications from the Department for Education and Ofqual. Schools must assess only the areas of the curriculum which students have been able to study, must use a specified range of student work in specified ways and be confident that work produced is the student’s own. DfE say schools must ensure that: ‘the student has not been given inappropriate levels of support … Exam boards will investigate instances where it appears that evidence is not authentic.’ Exam boards reviewed schools’ and colleges’ quality assurance processes before the centres submitted grades, which had to be done by 18 June. The boards checked the evidence for a sample of student grades in a sample of subjects, in a sample of schools and colleges during June and July. After those checks the exam boards chose to visit (virtually) some schools and colleges and to review evidence for some students. Exam boards told centres in June which subjects and students had been selected. The process was explained to students in a useful government summary. A blog by independent academics who had been asked by Ofqual to review the 2020 student experience noted the extreme diversity of individuals’ learning experiences during the pandemic, and said:
Ofqual has gone on record stating that it is impossible to compensate fairly for the level of diversity of experience in terms of the grades that will be issued in the summer of 2021. [Emphasis added]
This is understandable given the huge variety of approaches in different schools. Research published today by the Sutton Trust shows ‘a big variation in the number of assessments being taken by A-level students to determine their grades.’
Exam boards decide whether the centre’s grades ‘are a reasonable exercise of academic judgement of the students’ demonstrated performance’. If not, the boards will ask the school or college to investigate, but boards will not in general re-mark the student’s evidence or give an alternative grade. ‘Grades would only be changed by the board if they are not satisfied with the outcome of an investigation or malpractice is found.’ The crucial test is spelt out by Ofqual:
Exam boards will compare a centre’s results and select centres where the proportion of grades in 2021 appears significantly higher or lower than results in previous years when exams took place – 2017, 2018 and 2019. Results for individual subjects, especially those with small cohorts, can vary more from one year to the next. So the comparison for a centre will be made at qualification level – for all GCSE subjects combined and all A level subjects combined. This doesn’t mean centres must award grades to closely match those in previous years, or that the information from previous years should be used to suppress results. That’s not the case. There can be good reasons for results to vary from one year to the next, and centres should record the reasons for any substantial variances, in line with the centre’s policy. Exam boards will prioritise for quality assurance checks those centres where results are more out of line with their historical results than other centres, including where grades are lower. [Emphasis added]
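One plausible reading of that flagging logic can be sketched as follows. The centre names, figures and the 10-percentage-point threshold below are all invented for illustration; Ofqual has not published its exact selection criteria.

```python
# Hypothetical sketch of the quality-assurance flagging described above:
# compare the share of top grades a centre awards in 2021 with its 2017-2019
# average, across all subjects combined, and prioritise centres whose
# deviation is largest in either direction. All data and the threshold
# are invented for illustration.

centres = {
    "Centre 1": {"hist_top_share": 0.22, "tag_top_share": 0.26},
    "Centre 2": {"hist_top_share": 0.30, "tag_top_share": 0.52},
    "Centre 3": {"hist_top_share": 0.25, "tag_top_share": 0.12},
}

def flagged(centres, threshold=0.10):
    """Centres whose 2021 top-grade share differs from the 2017-2019
    average by more than the threshold, with the size of the deviation."""
    return {
        name: round(d["tag_top_share"] - d["hist_top_share"], 2)
        for name, d in centres.items()
        if abs(d["tag_top_share"] - d["hist_top_share"]) > threshold
    }

# Centre 2 is well above its history and Centre 3 well below; both would be
# prioritised for checks, while Centre 1's small rise would pass unflagged.
```

Note that on this reading the comparison is purely at whole-centre, whole-qualification level, which is why (as the passage above says) individual subjects with small cohorts are not compared year on year.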
Exam boards will drill down for more evidence where they are not satisfied and may ask centres to reconsider their grades. If a board disagrees with a centre’s grades it may withhold results. The 2021 approach alleviates to some extent the 2020 problem of very fine-grained subject-by-subject competition within each school for grades, and may allow more scope for going beyond the bounds of 2017-2019 achievement, but still means that students this year are competing for grades with their own schoolmates. Dennis Sherwood predicts that grades will more closely approximate to UCAS predictions, implying a further substantial increment of ‘grade inflation’. He says:
This year’s process has set student and parent against school and teacher, with exam boards, Ofqual and the Department for Education as bystanders. And with Ofsted, the inspector of schools, nowhere in sight.
It is worth noting that most schools in effect already know their students’ grades but are not allowed to tell the students until results day on 10 August. The approach described by one large comprehensive school is likely to be typical of the meticulous care given by centres to make the process as fair as possible. The Sutton Trust research notes that:
According to polling of 3,221 teachers by Teacher Tapp, 23 per cent of teachers at private schools and 17 per cent at state schools in affluent areas say that parents had approached or pressured them over their child’s exam grades this year. The same was true of just 11 per cent of teachers at state schools in poorer areas.
Such pressuring of teachers may be unlikely to succeed in most cases, but regardless of the outcome the longer-term effects may, as Sherwood argues, be to diminish trust in teachers and schools.
Before a grade is submitted, teachers should make students aware of the evidence they are using to assess them. Students will then have the opportunity to confirm the evidence is their own work and make their teachers aware of any mitigating circumstances they believe should be taken into account.
Ofqual have also properly been concerned to avoid or minimise any systemic bias in teacher assessment and commissioned a review of 2020 which concluded:
We found evidence that:
• gender bias was mixed – but a slight bias in favour of girls (or against boys) was a common finding
• ethnicity bias was mixed – there were findings of bias against as well as in favour of each minority group (relative to the majority group) as well as findings of no bias
• disadvantage bias was less mixed – bias against the more disadvantaged (or in favour of the less disadvantaged) was a common finding
• SEN bias was less mixed – bias against pupils with special educational needs (or in favour of those without) was a common finding
Early concerns about the 2021 process – whether sampling will be reliable, how partial curriculum coverage will be handled, and more – expressed by one of the prominent expert critics in 2020, George Constantinides of Imperial College, may still be relevant.
Students will receive A/AS-level results on 10 August, a week earlier than usual. Results for vocational and training qualifications linked to progression to further or higher education will also be issued to students on or before that date. There is then:
a window for students who believe their grade is wrong to raise an appeal. Exam boards will support schools and colleges in prioritising appeals where their outcome will determine a student’s ability to progress to their next stage of education or training.
The deadline for appeals to the exam board is 23 August if a student has applied to university and not achieved their first choice. In other cases it is 17 September. The appeal process is better than in 2020, but that was a low threshold.
You can only appeal if you think your school or college either:
• did not make a reasonable judgement when deciding which evidence to use to determine your teacher-assessed grade
• did not make a reasonable judgement about your grade based on the evidence gathered
• didn’t follow its procedures properly when working out your proposed grade
• made an administrative error when submitting your proposed grade
You can’t appeal until you have received your results on results day.
If a student wishes to appeal, the centre will first check it has followed all processes correctly. If there is an error the centre may submit a revised grade. If not, and the student still wants to appeal, they ask their school or college to submit a formal appeal to the exam board for them. Ofqual explains:
The exam board will check the centre followed its own processes and exam board requirements as well as reviewing the evidence used to form their judgement and providing a view as to whether the grade awarded was a reasonable exercise of academic judgement. If the exam board finds the grade is not reasonable, they will determine the alternative grade and inform the centre. In cases of disagreement between the centre and the exam board, or if the student disagrees with the centre or the exam board, the case can be referred to Ofqual’s Exams Procedure Review Service (EPRS). The exam board’s decision on the grade following appeal will stand unless the EPRS finds that the exam board has made a procedural error. Appeals are not likely to lead to adjustments in grades where the original grade is a reasonable exercise of academic judgement supported by the evidence. Grades can go up or down as the result of an appeal. [Emphasis added]
Dennis Sherwood has tracked the evolution of the Ofqual-prescribed appeals process, noting that:
It is an expert, trusted, independent, unbiased second opinion that an appellant seeks. And it was expert second opinions that Ofqual used to determine the reliabilities of GCSE, AS and A level grades, as presented in their two landmark reports, ‘Marking Consistency Metrics’, of November 2016, and November 2018’s ‘Marking Consistency Metrics – An update’, Figure 12 of which shows measurements of the reliabilities of the grades for each of 14 subjects – the key evidence that, on average across all subjects, about ‘1 exam grade in every 4, as actually awarded, is wrong’. But in the summer of 2016, just a few months before the first Marking Consistency Metrics report was published, Ofqual announced a change in the rules for appeals, denying access to an expert second opinion except in very limited circumstances. In my view, the consequences of that change have been pernicious when there were exams, more pernicious last summer when there weren’t, and I expect even more so this coming summer since so much more is based on local judgement.
If you are dissatisfied with the grades you receive this year, what should you do?
Appeal: this year it is free and individual appeals are allowed. But the process is still narrow and stacked against appellants, because it depends on teachers or schools being prepared to admit they made mistakes in procedure or academic judgement. So take particular note of that Ofqual statement that ‘Appeals are not likely to lead to adjustments in grades where the original grade is a reasonable exercise of academic judgement’. Academic judgement leads to the ‘normal’ situation where one grade in four is wrong, because results are only ‘reliable to one grade either way’. A student with results ABB but needing AAB for university entry may well find that they are denied their place because many universities are operating at the very margins of capacity. But Ofqual admits that ABB could, even within the bounds of ‘reasonable academic judgement’, be anywhere from A*AA to BCC. In these circumstances, will an appeal fail? Very probably it will, but is the appeal process robust enough to withstand a subsequent legal challenge? Perhaps not.
The courts generally avoid entering the realm of challenging reasonable academic judgement, but where the fuzziness of grades is admitted and the consequences of one grade change are so significant, judicial review may become a real possibility. However, that is very unlikely to be sufficiently fair to the disappointed students who seek a university place in 2021, because any legal proceedings will stretch well into the 2021-2022 academic year. Government and Ofqual must already be wary of such legal challenges and it is hard to see what would constitute a decent defence. Even if there were such a defence, Ofqual runs the risk of failing to maintain public confidence in the integrity of the overall process.
To summarise, before 2020 the examination system on average delivered one wrong grade in every four. In 2020 there were new and different kinds of unfairness. This year things have improved, but Ofqual admits it will be impossible to compensate fairly in grades for the variety of experiences in different schools. The students taking A-levels in 2022 have already had major interruptions to their studies, and the cohorts beyond 2022 may also be affected.
A-levels aren’t fair and they never were, but will they be fairer in future? Dennis Sherwood argues that even the best marking may still result in unreliable grades, so:
The solution to the problem of grade unreliability is therefore not to be found in marking, but in changing the policy defining how the grade is determined from the original mark.
Jo Saxton, the new Ofqual chief regulator, was an adviser to Gavin Williamson before her appointment. At her ‘confirmation’ hearing with the Select Committee she seemed surprisingly unprepared for questions about the unreliability of grades, confusing the quality of marking with the reliability of grading, so the signs are not good. We can only hope that she keeps the promise she eventually made to the Education Select Committee on 6 July that:
we should leave no stone unturned in ensuring that the public have confidence in the grades that young people are awarded.