Exam appeals – a contradiction that must be resolved

This blog was kindly contributed by Dennis Sherwood, who is on Twitter @noookophile.

Official consultations are important documents, if only to allow the issuer to claim that ‘we consulted widely’. Whether or not the issuer is influenced by any responses is another matter, and those responses are of course heavily influenced by the questions posed, and the way those questions are phrased. So I’m always intrigued by the way questions are asked, by questions that are not asked, and by the issues that are not on the table.

Taking the Department for Education consultation on post-qualification admissions as an example, one question that isn’t there is ‘To what extent do you consider that a necessary pre-requisite for any form of PQA is the delivery of fully reliable and trustworthy GCSE, AS and A level assessments?’ – which picks up on Ofqual’s acknowledgement that exam grades are currently ‘reliable one grade either way’. And Ofqual’s consultation on the process for awarding this summer’s exam-free grades fails, for example, to seek views on awarding a ‘standardised leaving certificate’ (an idea, option K, rejected in last year’s options study), and on the possibility that requiring teachers to discriminate between 7 A level grades and 10 GCSE grades might not be the most sensible approach, possibly causing many students awarded a grade B or 5 to ask ‘why not an A (or 6)?’ and appeal accordingly.

The process for appeals, however, is addressed in Ofqual’s consultation. Here is an extract from page 22, relating to marked exams:

In both cases – for exams and non-exam assessments – the original mark must not be changed unless a marking error has been made. For many assessments it is not possible to say what is a ‘right’ mark for a student’s work. This is because markers must exercise their academic judgement when giving the mark. It is often the case that 2 trained markers could give slightly different marks for the same answer and that both marks would be legitimate. In this case if the original mark given can be supported it should stand.

These words are not new, and have applied since 2016. But I was puzzled by them then, and I continue to be puzzled now, for I think they contain a fundamental contradiction.

The central theme is familiar to anyone who has ever marked a script: ‘It is often the case that 2 trained markers could give slightly different marks for the same answer and that both marks would be legitimate’. Yes. But an inference that might be drawn – and perhaps intended? – is ‘therefore the two marks are equivalent, so it doesn’t matter which is given’.

No. They are not equivalent. And it matters very much.

If those ‘2 different but legitimate marks’ are within the same grade width, then both result in the same grade, which is fine. But if they are on different sides of a grade boundary, each results in a different grade, one higher, one lower. That is not fine at all, for only one of those grades appears on the candidate’s certificate. And if that is the lower one, the consequences could be devastating – missing a university place, having to re-sit GCSE English or Maths, losing self-confidence, setting poor expectations for the next level…
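
To make the boundary effect concrete, here is a minimal sketch in Python. The grade boundaries and the marks are hypothetical, chosen only for illustration and not taken from any real exam or from Ofqual's documents. It shows that the same two-mark difference between examiners may leave the grade unchanged or change it, depending only on where the marks happen to fall.

```python
from bisect import bisect_right

# Hypothetical grade boundaries: the minimum mark needed for each grade.
# These numbers are an assumption for illustration, not real boundaries.
BOUNDARIES = [(40, "E"), (50, "D"), (60, "C"), (70, "B"), (80, "A")]


def grade_for(mark: int) -> str:
    """Return the grade for a mark, or 'U' if it is below the lowest boundary."""
    thresholds = [minimum for minimum, _ in BOUNDARIES]
    index = bisect_right(thresholds, mark)
    return "U" if index == 0 else BOUNDARIES[index - 1][1]


def straddles_boundary(mark_a: int, mark_b: int) -> bool:
    """True if two marks for the same script lead to different grades."""
    return grade_for(mark_a) != grade_for(mark_b)


# Two legitimate marks within the same grade width: same grade either way.
print(grade_for(64), grade_for(66), straddles_boundary(64, 66))

# Two legitimate marks either side of a boundary: the grade depends on the marker.
print(grade_for(69), grade_for(71), straddles_boundary(69, 71))
```

In this sketch the marks 64 and 66 both map to the same grade, while 69 and 71 map to different grades, even though each pair differs by exactly two marks.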

Furthermore, the statement that there is no ‘right’ mark implies that there is no ‘right’ grade. Both grades are equally ‘right’, yet only one appears on the certificate.

The contradiction is in the first sentence: ‘the original mark must not be changed unless a marking error has been made’.

My brain has just exploded.

If there is a possibility that the grade as ‘awarded’ on the certificate is lower than another equally ‘right’ grade, why is it inadmissible to appeal that possibility, thereby denying the candidate the benefit of the higher grade that is equally ‘right’? How can this be a feature of an exam system unfailingly described as ‘the fairest and most accurate way to measure a pupil’s attainment’? How does this fulfil the objective of serving ‘natural justice’, as sought by Ofqual’s Chief Regulator, Simon Lebus?

It might be thought that relatively few ‘different and legitimate’ marks actually straddle grade boundaries. Far from it. Ofqual’s own research shows that, on average across all subjects and all levels, about 1 grade in every 4 would have been different had a different examiner marked the corresponding scripts – which, in real terms, is about 1.5 million ‘ambiguous’ grades ‘awarded’ every year. Not just that: there are wide variations by subject (Maths and Physics, for example, being considerably ‘better’ than English and History), and also by mark within subject.

The contradiction resulting from mapping multiple ‘different and legitimate’ marks onto cliff-edge grade boundaries has bedevilled the exam system for years, with huge numbers of students damaged as a result. And although the awarding process will be very different this summer, the contradiction remains (see pages 23 and 24).

I think the policy that appeals cannot recognise ‘legitimate’ differences in academic judgement is deeply flawed, yet, for marked exams, this is easy to fix by recognising the possibility of these differences in the assessment as originally awarded. To do that would benefit everybody: students, teachers, parents, carers, admissions officers and employers would have confidence that assessments are truly reliable and trustworthy, and reliable assessments are an absolute pre-requisite for any form of PQA to make any sense at all.

And although the school exam appeals process is of especial significance to candidates, schools, parents and carers, higher education is the primary ‘user’ community, and so higher education can, and should, exert a powerful influence on the school exam system so that it delivers what we all require: trustworthy and reliable assessments, and a fair and open process for appeals.

So whatever sector you might be in, if you think this policy is flawed too, and that this should be fixed, then there are still a few days left to respond to Ofqual’s consultation – indeed, early signs indicate that the appeals process is a significant concern for those who have responded so far. The consultation may be completed online, so now is the time to ensure that your voice is heard.

3 comments

MARK BROWN says:
27th January 2021 at 09:40
As a former A Level examiner for History, I found that the ‘standardisation’ meetings every summer always revealed massive discrepancies in examiners’ assessments of the same work (even from the most experienced). The mitigation comes from the fact that with online marking, it is likely that each essay will be marked by a different examiner, thus ironing out the inconsistencies. At least candidates are no longer subject to the whims of one individual on a particular paper.
Dennis Sherwood says:
27th January 2021 at 10:21
Hi Mark – thank you.
And you are absolutely right – if different examiners mark different questions on the same script, the likelihood of any systematic bias attributable to a single examiner is much reduced.
So here’s a question: suppose that same script is marked by a different ‘team’. Would the mark be exactly the same as that given by the original ‘team’?
This is all about the intrinsic variability of legitimate marks given by different examiners to the same question, rather than the systematic ‘softness’ or ‘hardness’ of any individual examiner.
‘Fuzziness’ is still present, and so the problem of the cliff-edge grade boundary remains.
One solution to this is to use artificial intelligence for all marking – so that all scripts in any one subject are marked by the same ‘examiner’, who doesn’t get tired or distracted, and who is using the same rules for all scripts. Fuzziness is eliminated, but there are other implications…
Another is to use unambiguous multiple choice exams, but that’s problematic too.
So my preference is to accept that fuzziness exists, and to recognise it in the policy that maps a (single) original mark onto the assessment that appears on a candidate’s certificate, as discussed here https://www.hepi.ac.uk/2019/07/16/students-will-be-given-more-than-1-5-million-wrong-gcse-as-and-a-level-grades-this-summer-here-are-some-potential-solutions-which-do-you-prefer/
Huy Duong says:
27th January 2021 at 10:37
The least Ofqual can do in appeal, which is quite easy, is to take the average of the two marks. As in 2020, it doesn’t seem that Ofqual tries hard to be fair to the individual.
It is a system designed to protect itself rather than students.
