Exam grades without exams: A better way

19 February 2021
By Keith Geary

Keith Geary is a former headteacher of two comprehensive schools and is currently a school governor and examiner.

Ofqual’s 2020 proposals for exam grades without exams stretched to 68 pages and generated a summer of results chaos and uncertainty for students and higher education admissions. 2021’s proposals run to 46 pages and – according to the Social Mobility Commission – risk ending in a ‘bigger disaster’ and ‘catastrophic unfairness’ for young people. Former UCAS Chief Executive Mary Curnock Cook warns the ‘sheer volume of appeals might overwhelm the system’. Ofqual’s 2021 proposals have generated 103,000 consultation responses, compared with 12,000 in 2020. Almost 50,000 came from students – 25 times more than responded in 2020. A fifteen-minute YouTube video showing a Physics teacher talking through a highlighted copy of the consultation document attracted almost 40,000 views.

That extraordinary number of consultation responses suggests that there is wide concern that the proposals are ill designed to secure fairness.

A Level students want to know their grades are merited by their work and that someone other than their own teachers has assessed its standard authoritatively against that of other candidates. Similarly, universities want to know the grades will be sufficiently robust and fair for the admissions process.

How do we ensure that the 2021 A Level grades are fair enough for both?

Everybody needs to understand how the process will work and see that it is as fair as circumstances allow, so keep it simple.

Apply what I will call the ‘evidence-of-best-performance’ approach.

Start from students’ work – what they have actually been able to learn and do during their course.

Select a small evidence base from each student’s work – their personal best, work showing what they are capable of. If that base is small and flexibly defined, the unfairness arising from the unequal impact of COVID-related disruption on individual students’ learning is reduced.

Use expertise in schools and colleges to propose grades: teachers produce a rank order of candidates based on the evidence-of-best-performance selections, indicating grade boundaries and internally moderated where possible.

Use expertise at the exam boards to validate grades by confirming or amending: examiners award grades as in any normal year through sampling, based on the rank order – seeking to confirm grades, but also sufficiently rigorous to ensure comparable standards across centres. This slight shift of focus for examiners can be readily achieved by adapting their standardisation training, drawing materials in the usual way from work submitted by the current candidate cohort.

Thus a centre’s proposed grades would be quality-assured against established grade standards by experienced, specialist subject examiners who would – in normal times – assess candidates’ performance. This would reassure both students and higher education institutions that grades have been appropriately awarded.

What would ‘evidence-of-best-performance’ consist of?

It would be a small sample representing a range of the subject content. That range should be limited, clearly defined but flexible enough to accommodate different degrees of disruption to learning.

The sample would be drawn from a range of areas.

Work completed under exam conditions, using tasks taken from questions or papers provided by the board. It is important that students know they are doing tasks that other candidates are also doing, so allowing fair comparison across centres. (Questions used could be those prepared for the 2020 or 2021 exam papers. There is no need to prepare fresh tasks – just give teachers flexibility in choosing which to use.)

Completed non-exam assessment – this is often an independent study or investigation of some sort but varies by subject.

Other illustrative independent work certified by the teacher as the candidate’s own.

The range – Ofqual’s ‘minimum proportion of overall subject content’ – should be set at a realistic level, but with allowance for special consideration requests for candidates whose COVID-related circumstances have prevented them reaching this minimum by the assessment deadline.

How would this work in practice?

I will illustrate this with A Level English Literature – a relatively subjective, therefore tricky subject, where two examiners can assess the same response, give different marks and neither be ‘wrong’.

1. The school or college determines an evidence-of-best-performance selection for each student:

Two essays across different papers from the exam board’s questions list, written under exam conditions. Students should have the opportunity to do more than two to remove the only-one-chance aspect of an exam – which particularly concerns 2021 students – and to ensure a choice when selecting the best two to submit.

One piece of non-examination assessment: this varies across exam boards; each would need to define this within common guidelines.

A ‘wildcard’ example of work if it is not possible to provide both of the above. This would be a representative piece of independent work, authenticated as such by the centre – for example, a mock paper.

2. The centre constructs its rank order of candidates with proposed grades and submits it to the board.

3. Using the rank order, the board identifies a sample of candidates (particularly those at grade boundaries) and requests their evidence-of-best-performance selections.

4. The evidence-of-best-performance samples are reviewed by the examiners who would normally have been marking the exam papers and non-examination assessment. The review question is simply: ‘Does this sample of the candidate’s work suggest that he or she is performing at the level usually required to achieve the proposed grade?’ One of two possible responses is required: ‘Confirmed’ or ‘Not confirmed: Performing at Grade X’.

5. This process would identify centres where the board needs to engage further and request further evidence.
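Steps 2 to 5 can be pictured as a small routine. What follows is only an illustrative sketch, not a description of how any exam board works: the names `Candidate`, `boundary_sample` and `needs_follow_up` are hypothetical, the sample is simplified to the candidates either side of each proposed grade boundary (a board would presumably sample more widely), and the `tolerance` setting that triggers further engagement with a centre is my own assumption.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str            # assumed unique within a centre
    proposed_grade: str  # grade proposed by the centre, e.g. "A"


def boundary_sample(rank_order):
    """Step 3 (simplified): pick candidates either side of each grade boundary.

    `rank_order` is the centre's list of candidates, best first.
    """
    sample = []
    for upper, lower in zip(rank_order, rank_order[1:]):
        if upper.proposed_grade != lower.proposed_grade:
            for candidate in (upper, lower):
                if candidate not in sample:
                    sample.append(candidate)
    return sample


def needs_follow_up(reviews, tolerance=0.0):
    """Steps 4 and 5: decide whether the board should engage further.

    `reviews` maps candidate name to None for 'Confirmed', or to the grade
    the examiner judged instead for 'Not confirmed: Performing at Grade X'.
    `tolerance` is a hypothetical share of 'Not confirmed' outcomes that the
    board would accept without requesting further evidence.
    """
    if not reviews:
        return False
    not_confirmed = sum(1 for grade in reviews.values() if grade is not None)
    return not_confirmed / len(reviews) > tolerance


# Hypothetical centre with five candidates, ranked best first.
rank_order = [
    Candidate("Ada", "A"),
    Candidate("Ben", "A"),
    Candidate("Cal", "B"),
    Candidate("Dee", "B"),
    Candidate("Eve", "C"),
]
sample = boundary_sample(rank_order)
sample_names = [candidate.name for candidate in sample]

# Examiner outcomes for the sampled candidates (invented for illustration).
reviews = {"Ben": None, "Cal": "C", "Dee": None, "Eve": None}
follow_up = needs_follow_up(reviews)
```

In this invented example the examiner does not confirm one proposed grade, so with a tolerance of zero the centre would be asked for further evidence. A board could set the tolerance and the sample size to suit the time available.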

In this way, an evidence-of-best-performance approach gives the greatest number of candidates the fairest chance to show their best no matter how much Coronavirus has disrupted their learning and preparation.

11 comments

Rob Cuthbert says:
19th February 2021 at 12:15
I was with you all the way to ‘centres produce rank order’. That was the most pernicious part of the 2020 experience, because it sets students against their own classmates. Instead of rank ordering, there should be moderation by external experts, probably at two or more levels, to promote nationwide comparability. Ample expertise is likely to be available, and local networks of schools could function as a first line of moderation in any case. I believe teacher professionalism plus rigorous moderation will lead to results at least as reliable as the 1-in-4-exam-grades-are-wrong system.
Dennis Sherwood says:
19th February 2021 at 14:05
Thank you – full of lively ideas, and I particularly like your suggestion that the exam board acts to verify that the teacher’s assessment is “reasonable”, rather than to over-ride.
One question, if I may, about appeals. If a student were unhappy, what are your thoughts about appeals? For example, might the appeal be against the teacher, the originator of the grade, or the exam board, who has confirmed that judgement as “reasonable”?
And on a related tack… for many months before exams were cancelled, the government repeatedly chanted “exams are the fairest way to assess student performance”. Any other way is therefore at worst, “unfair”; at best “less fair”.
Teacher assessment is one such “other way”, and teacher assessment – however conducted (and I hope many of your ideas will be implemented) – will be central to what happens this summer.
So has the government already undermined the process, by casting doubt on teacher judgement? And if so, what can be done to restore trust in teachers?
Keith Geary says:
20th February 2021 at 09:19
Thank you for these comments. Here are a few thoughts on points raised.
Keith Geary says:
20th February 2021 at 09:22
Rank orders:
Yes, a rank order can be pernicious (Lear’s attempts to establish one didn’t go too well), but it doesn’t have to be: it depends on how you view it and how you use it.
A rank order is a tool to help examiners – it’s only a starting point, enabling examiners to interrogate the evidence presented in matching demonstrated performance to standards. In normal times, a rank order is established from the marking of an exam paper and is then tested by sampling and review before grade boundaries are drawn. Thus a rank order is used to facilitate moderation (as in the process I propose for 2021); this is a not-only-but-also, not an either/or. The higher the proportion of sampling, the more rigorous and robust the process.
With a longer time scale I would be arguing that every candidate’s evidence-of-best-performance selection should be collected in and looked at by the board – I still think this would be best – but the consultation proposals seem to imply Ofqual’s view is that the exam boards should do as little as possible – hence, for example, the pushing of appeals back on to schools and colleges – and the time available and the lateness of these (still to come) decisions are a daily increasing issue.
With March upon us, I think we’re in a situation where we mustn’t let pursuit of the best (all candidates’ evidence-of-best-performance selections are taken in and reviewed/moderated by the exam boards at the first stage) derail the possibility of achieving something that’s at least better for students than what is currently proposed.
Keith Geary says:
20th February 2021 at 09:23
Local networks of schools could function as a first line of moderation:
Those of us old enough to remember local CSE moderations will know that this approach is fraught with difficulty and that even in those pre-GCSE days local school politics and competition played into the grade negotiations that went on, with schools conscious of their pecking order and how many Grade 1s they should have because that’s what they’d had in previous years. (Sound familiar?) A centre would often find that by the end of the ‘moderation’ the consistency of its assessments had actually been eroded by the moderation process because of the inconsistent level of expertise across the large number of teachers involved, to say nothing of some teachers’ and schools’ effectiveness in ensuring that their grades stayed as they were!
I believe that what I have proposed combines teacher professionalism and rigorous moderation, and whilst it is essentially a quick fix in the difficult circumstances of 2021, it is also potentially a way forward for exam assessment in the future.
Keith Geary says:
20th February 2021 at 09:25
Appeals:
One of the most absurd aspects of the Ofqual proposals is the idea that appeals should be handled by the schools and colleges themselves. This feels like an attempt to push all responsibility for any repeat of 2020’s explosion of protest once results are announced on to teachers. (Alison Peacock, Chief Executive of the Chartered College of Teaching, has referred to the proposals in terms of the teaching profession being ‘fed to the lions’.)
Responsibility for awarding grades lies with the exam boards and therefore the management of appeals against grades should be theirs, too, as in any normal year. The approach I have outlined allows this. The evidence-of-best-performance selections are available for every candidate (either already submitted as part of a centre’s sample or held at the centre), so where an appeal is raised the evidence is there to be called in and reviewed in the normal way.
One of the exam boards suggested in its consultation response that teachers should tell students their grades before submission and enter into negotiations with them (they call it ‘ongoing discussion between a teacher and a student about the potential grade to be awarded’) as a way of avoiding appeals: ‘In our view, this in turn has the potential to significantly reduce the risk of appeals and gives the student some agency in the process’. Well, yes, it would and what a Pandora’s box of dispute and inequality it would open—I am reminded of Robert Halfon’s warning about ‘well-heeled and sharp-elbowed parents’ last summer!
Keith Geary says:
20th February 2021 at 09:27
Undermining trust and confidence in teacher assessment?
Well, I think the short answer to your question is ‘Yes’ and, of course, government has a well-established tradition in this, not least in its reform of examination assessment since 2010 and the noise around it. Now, of course, they need teacher assessment to be seen as rigorous and robust so as to get them out of a hole, though the Education Secretary’s statement that government is ‘going to put our trust in teachers rather than algorithms’ seems to many a strategy to set teachers up to take the blame for any repeat of 2020’s grades fiasco, when robust contingency plans for the summer 2021 exams (with and without exams) should have been–and could have been–in place by the October half-term as Robert Halfon quite rightly argued last August.
The place of teacher assessment within a national (and global) public exam system is too complex for a comment here, but I think Simon Lebus is quite interesting – and nuanced – on teacher assessment. His letter as Interim Chief Regulator to the Secretary of State merits close attention as much for its subtext as for what it says – it seems to me a coded warning about the obstacles and risks involved in what his political master is requesting, a pre-emptive I-told-you-so. In 2007 he made a speech about ‘Intelligent Regulation: Trust and Risk’ just as government’s intention to establish an independent exam regulator was announced. In 2009 he had this to say to the Children, Schools and Families Select Committee when asked whether there was room for greater teacher assessment in place of the formal examinations provided by exam organisations such as his own, Cambridge Assessment:
‘There is no question that there is room for greater teacher assessment. I think the difficulty, as ever, is the question of public trust. There have been various debates about coursework and the extent to which people are schooled in coursework so that they can do very well in it, and then how that compares to written qualifications. There is nothing educationally wrong with teacher assessment at all. The question is how ready people are to trust that. Also, just thinking from an international perspective, and looking at what has happened in qualifications over the last 10 years, we live in a global economy. People are increasingly mobile. Qualifications are a form of currency and a support for them in their mobility and their careers, and they need to be trusted. I think it is a case that where systems have very large elements of teacher assessment, degrees of trust tend to be slightly reduced.’
All still relevant.
Dennis Sherwood says:
20th February 2021 at 12:03
Hi Keith
Wonderful! Thank you once more for these rich thoughts and insights.
Let me pose one mind-game, and ask one further question.
The mind-game is this: imagine a world without educational rank orders. What would this world look like? How would it operate? In what respects, if any, might this be ‘better’ than our current rank-order-obsessed process? And in what respects, if any, worse?
My question relates to that most important property of the system, trust. This underpins everything, for in the absence of trust, everyone would appeal – which was in essence what happened last summer when an entire population of ‘expert second opinions’ enabled each student to compare their CAGs with the results of the algorithm. Your quotation from Simon Lebus that “The question is how ready people are to trust [teacher assessment]” is telling, for it implies that “people” trust the exam system more. But is that because most of the “people” don’t know just how unreliable exam grades are, that 1 grade in 4 is wrong? Especially since the appeal system was deliberately changed in 2016 to deter discovery?
Keith Geary says:
24th February 2021 at 09:33
Operationalising a Better Way:
Mary Curnock Cook has commented positively on Twitter about the approach I outline (‘Nice simple construct here for awarding grades this summer’) but suggests that ‘even this simplicity would be difficult to operationalise in such a huge system’.
Not so, I believe, certainly for A Level with many fewer candidates than GCSE. It uses the systems and personnel that already exist, simply adjusting their focus. Within centres, the extraordinary level of organisational efficiency and rigour that Examination Officers routinely demonstrate year in year out would certainly be equal to co-ordinating the in-centre aspects as well as ensuring that the exam boards receive what they need when they need it.
For GCSE, I acknowledge, scale poses greater challenges, but the two key enabling factors for ‘processing’ the usual number of candidates in each subject—that is, the number of teachers and the number of examiners—are the same…
Alternatively…now that schools and colleges are to reopen, you could reinstate exams in English and Maths (with a reserve series three weeks or so later for any student unable to take the May exam) and then use the evidence-of-best-performance approach for the other subjects, but rather than using it to sort candidates into the current 9 GCSE grades (as others have asked, why do we need so many grades?) allocate performance across three bands equivalent to grades 1-3, 4-6, 7-9. Exam boards’ review of evidence-of-best-performance selections would need to be adjusted to reflect this change, but the change to three bands would make the process easier.
Keith Geary says:
24th February 2021 at 09:34
Private candidates:
Elsewhere (again on Twitter) a parent has asked how this would work for home-schooled/private candidates. This isn’t a problem as it can be managed through the centre which agrees to enter the private candidate. In constructing the candidate’s evidence-of-best-performance selection, the exam board’s requirements would apply in the usual way. So, the candidate would do the ‘Work completed under exam conditions’ element of the selection at the centre (just as he or she would have done the normal examinations at the centre) and any non-exam (or ‘internal’) assessment would follow the established authentication process for private candidates as in a normal year for which the entering centre’s role in supervision, authentication and marking is laid down. (OCR’s Private Candidates webpage has a short and clear description.) The centre would then include the private candidate in its rank order on the basis of the evidence. (Where a private candidate’s selection needed to include what I describe as ‘Other illustrative independent work certified by the teacher as the candidate’s own’, I cannot imagine teachers at the centre not doing this as long as they have other evidence in the selection to compare.) This way private candidates would earn their grade in the same way as all other candidates.
Keith Geary says:
24th February 2021 at 09:36
Trust:
That’s a big question about trust, Dennis—too many facets to be adequately answered in a short comment. However, it is a fundamental question wherever one stands on the weighting-and-reliability-of-teacher-assessment-to-assessment-by-formal-exams continuum.
I would offer a few observations:
• In this area, as in any, trust is not an absolute concept—it’s a sliding one. (We may trust all four of the doctors at our GP surgery, but there will be one we trust more, which doesn’t mean we’re not willing to rely on (so trust) the judgement of the others.)
• The challenge for any large-scale, high-stakes assessment system, such as A Level, is to secure a workable level of trust, which—as you say—collapsed in 2020, an outcome which some of us felt was foreseeable as soon as the grading consultation was published. (The issue of whether the current exam grading system merits trust is obviously related but also separate.)
• In education since the early 80s we have seen trust eroded—in schools, in teachers, in exams—which reflects similar changes in our perception of other institutions and professions, for example, government or the law. The idea of taking things on trust now tends to be regarded as rather naïve folly. In many ways this is probably a good thing as is the habit of scrutiny which it has encouraged.
• We are now exhorted to value and demand ‘transparency’ as a sort of substitute for trust. It isn’t. (When someone says they’re being ‘transparent’, it’s usually a sign they’re not—they just want you to think they are!) Transparency may lead to trust, but it doesn’t guarantee it. This is particularly important in the fraught area of our young people’s public examinations. (We need only cast our minds back to the annual media coverage of the summer results in normal pre-COVID years.)
• Now that we’ve eroded trust in both teacher assessment and exams—partly through ‘transparency’—we need to start again.
• With exams – particularly A Levels, which remain the primary gateway qualification for Higher Education – the challenge moving forward is to construct a new balance between teacher assessment and exam assessment which is, first, robust and rigorous and, second, sufficiently flexible in times of crisis (such as in 2020, this year and most likely 2022 as well, given the lost learning time the 2022 cohort will have experienced across their GCSE and A Level courses) to secure the trust of students and all other stakeholders. That means change in the management of both types of assessment, their relationship, the monitoring of standards and probably the grading process that they all feed into…