Hindsight is a wonderful thing: Ofqual, gradings and appeals by Dennis Sherwood

23 July 2020
By Dennis Sherwood

This blog was kindly contributed by Dennis Sherwood, who has been tracking the goings on at Ofqual in relation to this year’s public exam results for HEPI.

Whenever I hear that cliché, I shudder: its purpose – especially when used by a politician – is to deflect attention from a previous woeful decision, to avoid the blame.

Hindsight might indeed be wonderful. But it’s foresight that’s important.

For foresight is demonstrated when a decision has to be taken, when a choice has to be made. Choices have consequences, and it is the job of the decision-maker to make the very best choice possible. And that’s the choice that passes the most stringent test there is, the test of time. When we look back, we must all agree, ‘Yes, given the information then available, that was a good decision’. That is the only way we can judge the wisdom – or otherwise – of decision-makers. And, if necessary, hold them to account.

As I write this, the IB [International Baccalaureate] is in turmoil, and – following the submission by schools of their ‘centre assessment grades’ and the corresponding student rank orders – Ofqual and the exam boards are in the process of using their ‘statistical standardisation’ algorithm to ‘make sure that grades are fair between schools and colleges’.

The details of precisely how this algorithm works have remained obscure, and so, in a recent report, the Education Select Committee recommended that:

Ofqual must be completely transparent about its standardisation model and publish the model immediately to allow time for scrutiny. In addition, Ofqual must publish an explanatory memorandum on decisions and assumptions made during the model’s development. This should include clearly setting out how it has ensured fairness for schools without 3 years of historic data, and for settings with small, variable cohorts.

Ofqual, however, subsequently refused, but in a recent symposium, a perhaps rather dim light has been thrown into this murky corner. The slide pack from this symposium has been published, and, to me, the slides contain some good news (small cohorts are explicitly recognised as requiring special treatment), some bad news (the appeals process is still very technical and narrow), and the sad, but perhaps unsurprising, news that the ‘”vast majority” of schools gave “optimistic” GCSE and A-level grades that would have meant unprecedented rise in results’.

A ‘Tragedy of the Commons’ has indeed happened, and there is a danger that teachers will be discredited.

It now seems that many centre assessment grades will be binned. Given that the algorithm only needs schools’ rank orders, why were they required at all? That’s just one question about this year’s process…

Two alternative histories

So here are two alternative histories about how grading GCSEs and A levels might have been, had Ofqual taken a different decision after Gavin Williamson’s statement of 20 March.

I believe the two choices I’m about to describe were open to Ofqual at that time, and draw only on information then available. So this is not about hindsight. It’s about foresight.

1. ‘No grade inflation’

Ofqual explicitly declares ‘no grade inflation’ to be the overarching policy, as achieved by constraining each school to its historical grade pattern. The exam boards know this pattern for every subject and every school; they also know how many candidates each school has entered for the 2020 exams. The exam boards can therefore use whatever algorithm they wish to calculate how many grades each school is ‘allowed’ for each subject (which could well be what the current standardisation algorithm is actually doing anyway).

The boards then send a form to each school saying ‘For this year’s [23] candidates in [GCSE Geography], your school is allowed [so many] 9s, [this number of] 8s… Please enter the names of the candidates to be awarded each grade.’

Schools are not being asked for centre assessment grades or rank orders. All they have to do is put the right number of names in the right boxes, paying particular attention to which side of each grade boundary any name is placed. And surely focusing on the grade boundaries, and being as fair as humanly possible across them, is by far the most important aspect of all this to think – and indeed worry – about.

A board might also allow a school to exceed a grade allocation, provided there is robust evidence. This puts ‘no grade inflation’ at risk, so it all depends on how much ‘wriggle room’ Ofqual might allow.
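The allocation step in this first alternative can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the historical shares below are invented, the function name `allocate_grades` is hypothetical, and the largest-remainder rounding rule is one reasonable choice among several, not a description of what Ofqual or any exam board actually does.

```python
from math import floor

def allocate_grades(historical_shares, n_candidates):
    """Turn a school's historical grade shares into whole-number allocations.

    Uses largest-remainder rounding, which is an assumption of this sketch.
    """
    # Ideal (fractional) number of candidates at each grade.
    ideal = {g: share * n_candidates for g, share in historical_shares.items()}
    # Give each grade the whole-number part of its ideal figure first.
    allocation = {g: floor(x) for g, x in ideal.items()}
    # Hand the remaining places to the grades with the largest fractional parts.
    leftover = n_candidates - sum(allocation.values())
    by_remainder = sorted(ideal, key=lambda g: ideal[g] - allocation[g], reverse=True)
    for g in by_remainder[:leftover]:
        allocation[g] += 1
    return allocation

# Hypothetical historical pattern for one subject at one school (shares sum to 1).
history = {"9": 0.05, "8": 0.10, "7": 0.15, "6": 0.20, "5": 0.20, "4": 0.15, "3": 0.15}

# The form sent to the school would list these numbers for its 23 candidates.
allocation = allocate_grades(history, 23)
for grade, places in allocation.items():
    print(f"Grade {grade}: {places} candidate(s)")
```

Whatever rounding rule is chosen, the point is that the school receives the numbers up front and only has to decide which names go in which boxes.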

2. ‘Trust the teachers’

Ofqual says ‘we trust’, and asks for grades based on teacher judgement. Since everyone understands that awarding a million A*s and five million 9s is daft, Ofqual also says, ‘please avoid grade inflation, and behave with integrity’.

Ofqual sends every school a spreadsheet, or provides an online facility, to calculate grades based on history, and dealing with arithmetical details such as averaging, rounding, and year-on-year variability. Since all schools do the computations in exactly the same way, making exactly the same assumptions, the numerical playing field is truly level. The results of the spreadsheet can then be adjusted, as each school wishes, to recognise individuals and also contexts that do not conform to historical averaging.

The validity of these ‘adjustments’ depends critically on the integrity of the teachers who make them. Accordingly, neighbouring schools mutually agree to act as external examiners, so helping temper over-optimism and eliminate bias. And with their wider perspectives, bodies such as ASCL, NAHT, HMC, the Sixth Form Colleges Association and the unions help organise the ‘mutuals’, review pre-submissions, ensure consistency of standards, test outliers, and, most importantly, suppress ‘gaming’. By working together in this way, a ‘Tragedy of the Commons’ is averted, resulting in a powerful demonstration that self-regulation works, and that teachers can indeed be trusted.

The schools then submit their grades, which the boards check, going back to the schools for explanations of outliers. Since these have been pre-vetted, there are not that many, and those that are present are well-evidenced.

On aggregating the results, Ofqual expects, and allows, some modest grade inflation. Adherence to an arbitrary statistical rule does not sacrifice fairness to each individual candidate.

The fair appeals safety net

Both alternatives have an appeals process like Scotland’s: free, and based on ‘further, evidence-based consideration of grades if schools and colleges do not think awarded grades fairly reflect learner performance’, with students being able to seek support from their schools if they feel they have been awarded an unfair grade.

Water under the bridge?

Those are just two possible alternative histories. There may be others. May I invite the imaginative to post comments accordingly!

And as I’ve said, all this was, I believe, totally foreseeable on March 20; nor is anything ‘rocket science’. Were these possibilities examined, and rejected? If so, given what has so far happened, was that rejection wise?

I don’t know how the choice to follow the process that is currently taking place was taken. I wasn’t around the table. And of course my descriptions are superficial, lacking detail on important matters such as precisely what evidence is needed to support an outlier; how ‘moderation mutuals’ might be set up; how, in practice, a school might be influenced along the lines of ‘those grades look rather high…’; how boards might most effectively engage in a dialogue with schools to enable them to explain their outliers. And many other things too. But these could all be addressed and resolved, and a practical procedure developed. Over the nine weeks between the original announcement of the cancellation (18 March) and the publication of Ofqual’s consultation decisions (22 May), there was plenty of time to get things right.

All that, of course, is water-under-the-bridge. Yes, I can hear the chorus of ‘It’s all very well for this smart Alec to be wise after the event! And who is he anyway?’ Indeed. So don’t rely on what I’ve written. Think about it for yourself.

But it isn’t quite all water-under-the-bridge yet.

To me, a very important and as yet unresolved issue concerns appeals. After much public pressure, the IB has announced it will ‘work with schools to review “extraordinary cases” for appeals’.

Before public pressure becomes overwhelming, should not Ofqual act to provide a safety net allowing appeals not merely on narrow technical grounds, but on the fundamental grounds of unfairness?

The author gratefully acknowledges his conversations with Huy Duong, Mike Larkin, and – especially in connection with ‘no grade inflation’ – Rob Cuthbert.

24 comments

Michael Bell says:
23rd July 2020 at 10:31
Dennis, an excellent piece. And you’ll know my response to your final question.
Yes, Ofqual should allow for individual appeals on the fundamental grounds of unfairness. Let it be evidence based, of course, but let it happen. Otherwise the grades this year are likely to be very unfair for a number of students, and their future plans potentially dashed or fundamentally impacted when they have no right of appeal or recourse.
Colin McCaig says:
23rd July 2020 at 11:19
Excellent blog
“The boards then send a form to each school saying ‘For this year’s [23] candidates in [GCSE Geography], your school is allowed [so many] 9s, [this number of] 8s… Please enter the names of the candidates to be awarded each grade.’”
This effectively means that Ofqual are applying back-door norm-referencing – based on the historical norm of the school – rather than criterion referencing, under which every piece of work that met the criteria for a 9 would be graded a 9. (Criterion referencing replaced norm referencing in 1983 when even a Thatcher government was happy to make the system fairer.)
So in effect government is throwing out the ‘baby’ of achievement (as judged by teachers) with the ‘bathwater’ of perceived grade inflation.
Along with today’s OfS warning about the evils of unconditional offers, this looks like an attempt to squeeze out marginal students in the name of preserving excellence in the ‘world class system’.
When Williamson and Donelan made their recent rhetorical assaults on widening participation, supposedly independent regulators swung into action without missing a beat!
Tania says:
23rd July 2020 at 14:18
Thanks, Dennis, for a thoughtful article and interesting link to the symposium slides.
I think the approach seemed to make sense when it was first proposed. Teachers know best what the students *should* get in an ideal assessment system, standardisation would provide the checks and balances, and the option to do exams in the autumn seemed like a robust safety net.
BUT “the devil is in the detail”. It now appears that the statistical approach may have a more significant impact than might have been expected. The level of variability at exam centres, particularly at non-selective state schools, makes that a very blunt approach. The fact that most schools stopped teaching GCSE/A level content in March means that the autumn exams will be a very un-level playing field.
I agree that it would have been better to have started with a “grade budget” for the schools based on historical performance and cohort ability. Dennis is suggesting that could have been by subject, and I think that would be good with the addition of an option to shift some, say 5%, between subjects. It is obvious that statistical variation by subject is going to be greater than statistical variation for the whole centre. Illustrated by a local school which one year had 2 people doing Product Design, both of whom got A*, then the next year 6 candidates with none getting more than C.
The thing which feels wrong now is that teachers spent time thinking about individuals and their abilities, but in the statistics everybody is a number and the “top down” approach will lead to unfair results, although hopefully the rankings will prevent them from being as unfair as some of the IB results. If the schools had been given a grade budget they could have distributed them as fairly as possible and, as Dennis suggested, had the option to negotiate more with evidence provided for change.
Of course OFQUAL should allow appeals for “wrong” results if supported by a school.
Huy Duong says:
23rd July 2020 at 14:53
Ofqual has done a good job of the effective chops and changes to the teachers’ predicted grades to arrive at the 2% grade inflation for A-levels.
My analysis of Ofqual’s published data shows that between 36% and 43% of each predicted grade will be downgraded, except for the predicted grade E.
In total, 39% of the predicted grades will be downgraded and 61% will remain the same.
Please see:
https://sites.google.com/view/2020-ofqual-grade-calculation/ofqual-grade-calculation-problems
What is worse, due to the limitations of statistics when dealing with small numbers, many of the chops and changes will be significantly random. So the awarding of grades this year has a strong lottery element.
I am surprised and disappointed that educational leaders and analysts have not expressed appropriate concerns.
Huy Duong says:
23rd July 2020 at 15:05
Sorry, there was a typo in my comment. One of the comments should have been:
“Ofqual has done a good job of hiding the effective chops and changes to the teachers’ predicted grades to arrive at the 2% grade inflation for A-levels.”
Dennis Sherwood says:
23rd July 2020 at 20:44
Hi everyone – thank you for all your comments!
Huy – those statistics make a lot of sense. All the grades get shifted down, and candidates above every grade boundary will get shifted below it. There are perhaps some weird CAG distributions that behave differently – for example, a school that submits only A*s and Ds, which the model changes to a spread of A, B, C – but those will be very rare I think.
Colin’s point about ‘back door norm referencing’ is a vivid description: if Ofqual were intending to enforce “no grade inflation” strictly, as they appear to be doing pretty well, why didn’t they just say so directly, so everyone knew the rules?
And that’s a great idea, Tania, about applying “no grade inflation” at the level of the school, rather than subject-within-school! That makes all the cohorts much larger, so the statistics are more reliable, and could well largely solve the “small cohort problem” at a stroke. Also, it gives schools much more flexibility in dealing with all sorts of special cases. If only Ofqual had sought your advice in March!!!
Huy Duong says:
23rd July 2020 at 23:31
Norm-referencing at the centre-subject level is silly and unjust. I wonder if Ofqual tested their model by trying to see how well it predicts the A-level results of the 2016, 2015, 2014, etc, cohorts? If Ofqual ever releases details about its model and someone does that kind of testing for a school and finds that it fails miserably, as will most likely happen for hundreds of typical schools, could Ofqual be sued for negligence? Ofqual’s claim that its process is the fairest possible is probably false. For example, Scotland has a fairer appeal stage, so it’s possible for Ofqual’s process to be fairer than it currently is by having a fairer appeal stage.
Rebecca Stevens says:
24th July 2020 at 08:56
Thanks for the blog
It does recognise, as you say, that hindsight is a wonderful thing!
I have to disagree with most of you in saying that Ofqual, in the time scale, did an impressive job and made important and difficult decisions wisely. Additionally, as ever, they allowed consultation which modified their proposals, most notably around the use of CAGs.
Their role in statistically moderating grades is, I would argue, the most crucial part of the process. Without it, 2020 grades would lack any credibility and be unfair to current students as well as those from past and future years.
The limited appeals process was put in place to free teachers up to predict and grade fairly, alongside the removal of league tables and performance management based upon results. I’d argue these were sensible decisions to make the process robust.
Is it perfect? No. Is the usual exam system fair? No! Here grade boundaries do the job of cutting and slashing grades, based on one exam, one day.
Is it probably the best model they could have developed in a month… Yes!
Huy Duong says:
24th July 2020 at 10:50
My question is: Is the reduction of A-level grade inflation by 10% (from 12% to 10%) worth downgrading between 36% and 43% of each of the teachers’ predicted grades except grade E?
Given that typical A-level centre-subject cohorts have between 10 and 30 students, the statistical confidence for downgrading between 36% and 43% of each of the teachers’ predicted grades except grade E must be extremely low in most cases. This means that this culling is extremely unreliable.
Even if we suppose hypothetically that the culling gets it right in three quarters of cases, Ofqual’s achievement of reducing grade inflation by 10% will have been achieved by downgrading the wrong 10% of A-level entries. What kind of achievement will that be?
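The arithmetic behind that last claim can be laid out explicitly. The sketch below uses the 39% overall downgrade figure quoted earlier in this thread and the purely hypothetical assumption that three quarters of downgrades are correct; the variable names are illustrative only.

```python
# Figures quoted in the comments above.
downgraded_share = 0.39   # share of all predicted grades that are downgraded
inflation_before = 0.12   # grade inflation implied by the predicted grades
inflation_after = 0.02    # grade inflation after standardisation

# Hypothetical assumption: three quarters of the downgrades hit the right students.
assumed_accuracy = 0.75

# Share of ALL entries that would then be downgraded wrongly.
wrongly_downgraded = downgraded_share * (1 - assumed_accuracy)
inflation_removed = inflation_before - inflation_after

print(f"Inflation removed: {inflation_removed:.0%} of entries")
print(f"Wrongly downgraded: roughly {wrongly_downgraded:.0%} of entries")
```

On those assumptions the two quantities are about the same size, which is the comparison the comment is drawing.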
Dennis Sherwood says:
24th July 2020 at 12:50
Hi Huy – you could be right.
But let me float another possibility. Let’s suppose there’s only one school, with only two candidates. They’re good, but not that good – so let’s suppose that the grades they truly deserve (in so far as anyone can assess that, but let’s ride with that for the moment) are A and B.
The teacher wants to give them the ‘benefit of the doubt’, and submits A* and A. The statistical machine does its thing and awards A and B. The machine is right, despite both grades being pushed down.
Another possibility. The teacher is worried that the parents will get angry. So the submission is 2 A*s – deliberately, in the expectation that the model will downgrade, which it does. And when the parents complain, the teacher says “I submitted A* – it’s nasty Ofqual that you should blame, not me”. There are other ‘games’ too.
And on top of all that are the unfairnesses built into the model itself.
A BIG PROBLEM with what has happened this year is that it is now impossible to untangle these explanations. They are all plausible, and I suspect all have happened to some extent. But we’ll never know.
Regrettably, I think the teaching community bear a lot of responsibility for this. Even despite the obscureness of the rules, you don’t have to be a rocket scientist to anticipate that “no grade inflation” will be the rule. So any school that submitted above the historic average was asking for trouble. Not enough people read the “Isaac” blog (https://www.hepi.ac.uk/2020/05/18/two-and-a-half-cheers-for-ofquals-standardisation-model-just-so-long-as-schools-comply/)! That was posted on 18 May, two weeks before the window for the submission of grades between 1 and 12 June.
Not only that: there was no ‘policing’ of ‘gaming’. ASCL, HMC, NAHT and the rest didn’t exercise any oversight – maybe they don’t have the authority – let alone leadership (although Geoff Barton at ASCL did say sensible things). And that can be foreseen too (https://www.hepi.ac.uk/2020/04/04/weekend-reading-a-for-ofqual-and-the-sqa-this-years-school-exam-grades-could-well-be-the-fairest-ever/).
So each school did their own thing, independently. A true “Tragedy of the Commons” (https://www.tes.com/news/exams-gcse-alevel-grading-issue-risk-concern).
Ofqual should have foreseen all of this, and designed a process that would have worked better – something else that could have been done, as this blog has outlined.
Oh dear.
Tania says:
24th July 2020 at 13:43
Huy Duong – that is a complex question, I assume you mean reducing grade inflation from 12% to 2%.
What everybody wants is for candidates to get the right grades and for those grades to be considered equal to grades awarded in a normal year. If there are almost twice as many A*s as usual, surely that will diminish the value of the grades in this year? Also it would reward schools which were less rigorous in reviewing their CAGs and potentially cause problems for universities.
I do not think it is surprising that a third are coming out high. I can use my son as an example. He was predicted A*A*A* for UCAS last year and I think that if he was in the current cohort, his CAGs would probably have been A*A*A*. In nearly all past papers he was within a few percent of the grade boundary, usually on the right side. Last year the Physics grade boundary went up so he ended up with actual results A*A*A. Those are fine grades and probably reflect somebody who is a marginal A/A* for all subjects better than if he had met his predictions. If he had been downgraded for 1 subject from his CAGs (ie by 33%) it is likely that he would have got the grades he deserved.
It would be nice to know that the system would pick up how many results are moderated for each student – many would be fine if one of their predictions were downgraded, but it would be tough if all 3 went down. That is part of the reason schools looking at each individual candidate rather than an anonymous statistical approach is better for fine tuning.
Dennis Sherwood says:
24th July 2020 at 18:35
Hi Rebecca
Many thanks for contributing to this discussion. You and I are in probably rather different places, which is fine – it’s the debate that counts!
But I certainly agree with you that some form of moderation was necessary. And as I stated in my response to Huy, the current muddle is not “all Ofqual’s fault” – the teachers, perhaps inadvertently, perhaps somewhat deliberately (those that were playing games), made their contribution too.
As regards appeals, my view is that any appeal should be about the actual outcome – the grade as awarded – and not about the process, or about whether that outcome was the result of a teacher’s judgement or the result of a statistical algorithm. The central concept is fairness, and there is an obligation on the appellant to produce appropriate evidence of an unfair outcome. In general, that is hard to do – but it does provide a safety net that is not, I believe, there at present.
So, thank you once again – may the debate continue!!!
Rebecca Stevens says:
25th July 2020 at 08:58
Thanks Dennis.
It is a fascinating debate and of high, well informed quality in this arena.
I do worry about teacher safety and student confidence if money-selling headlines hit the tabloids… or even worse The Guardian!
Another great point that has been largely ignored by the odious over-opinionated press, but is explained brilliantly in your 2019 article I believe, is the inaccuracy of our usual examination system. Suddenly that has become the pillar of reliability and validity.
I suppose I am accepting (with a great deal of terror as both a teacher and a parent of examination age pupils) that every exam grade is an estimate and should probably be issued with a confidence interval.
More recognition of this would be helpful to soothe tensions on 13th and 20th August.
Dennis Sherwood says:
25th July 2020 at 09:15
Thank you, Rebecca.
And yes, we are in total agreement on all that!
May I float an idea, please?
The ‘confidence interval’ concept is, I think, powerful. But have the ‘powers’ considered it? I don’t know. And to me the key power is Robert Halfon, and the Select Committee. My sense is that, following Ofqual’s refusal to implement their recommendation to “be transparent and publish the model immediately”, they might be feeling a bit peeved, and might be “interested” in further “ammunition”…
If that makes any sense, do you have any thoughts on how the Select Committee might be briefed on this?
Tania says:
25th July 2020 at 10:23
Dennis, I would like to defend the teachers!
They don’t know which students would underperform on the day and I suspect that for almost every student there will be genuine evidence of results straddling grade boundaries and there are surprises in the results for the teachers every year.
As I said above, A*A*A was a fair reflection of my son’s performance at A level, but for each subject an A* was more likely than an A, so the CAG for each subject should have been an A*, but it would be appropriate for 1 to slip in the standardisation process.
This is a weakness of the subject focused rather than candidate focused approach. I don’t know how you would resolve that when a lot of candidates are entered for exams with more than one Board.
On the “confidence interval”, it would be great and I would like to see the end of cliff edge results, but people like to see a very explicit system which they think they understand (of course people who follow your work know that is an illusion!). UMS was very helpful for making comparisons possible, I think Cambridge university managed to use it effectively, but most people did not understand it. Confidence intervals are even more complicated even though they are a more truthful reflection of exam results.
Dennis Sherwood says:
25th July 2020 at 11:09
Hi Tania – well said, thank you, and I too believe in teachers – https://www.hepi.ac.uk/2020/03/21/trusting-teachers-is-the-best-way-to-deliver-exam-results-this-summer-and-after/!
I’m sure we’ll agree that one of the BIG PROBLEMS with exams is how the student will “perform on the day” – I certainly remember some of my off-days. So, to my mind, the more that the student can be assessed ‘in the round’, the better. And a wise teacher, in a widespread culture of integrity, is likely to be best placed to do that.
Looking ahead to a bright-new-future, it’s likely that there will be a sensible place for exams in some form alongside teacher judgement – in which case, as I’m sure we’ll agree again, they should be assessed reliably.
I agree with you about the issue of interpreting the confidence interval, and when I was present at a big-wig discussion of the possibility, it was dismissed as too difficult to understand and likely to be the cause of confusion. So those big-wigs took the view it was preferable that 1.5 million grades should be wrong year-in-year-out than to provide reliable information. I despaired then; I despair now.
If a confidence interval were to be used, perhaps with the UMS score as the central figure – say, UMS ± x – then my hunch is that everyone will adopt the upper figure UMS + x as the key number, giving candidates ‘the benefit of the doubt’, which is fine with me for GCSEs (if they continue!) and A levels. Although I might myself prefer UMS – x for brain surgeons, plumbers and the driving test!
Oh – and on re-reading this, I’ve made some presumptions where I’ve said “I’m sure we’ll agree”! Well, I hope we’ll agree… !!!
Huy Duong says:
27th July 2020 at 00:29
Hi Dennis,
You wrote
“A BIG PROBLEM with what has happened this year is that it is now impossible to untangle these explanations. They are all plausible, and I suspect all have happened to some extent. But we’ll never know.”
Very true. But scientific thinking, and indeed honesty, requires that when we don’t know, we don’t say that we do. What Ofqual is doing is unscientific. Furthermore, it is dishonest in that Ofqual gives the public the false impression that its “standardisation” has somehow untangled the mess and made this year’s grades consistent between centres and between years. But that is unlikely to be true for A-level subject cohorts of up to at least 30 students at non-selective state schools. The reason is that the natural year-on-year variation in results at the centre-subject level is too great for statistical modelling to make sense. You can see this from the 2017-2019 A-level results of a school in Oxford with 1100 students (Year 7 to 13) at the link below:
https://sites.google.com/view/2020-ofqual-grade-calculation/data-from-a-typical-comprehensive-school
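The scale of that natural variation can be illustrated with a simple binomial model. This is a rough sketch under idealised assumptions, not an analysis of the linked school's data: it supposes a cohort of 20 students, each with the same independent 25% chance of a top grade every year, and both figures are invented for illustration.

```python
from math import comb, sqrt

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Hypothetical centre-subject cohort with an unchanging underlying standard.
cohort_size = 20
top_grade_rate = 0.25

expected_count = cohort_size * top_grade_rate
std_dev = sqrt(cohort_size * top_grade_rate * (1 - top_grade_rate))

# Chance that a given year lands within one student of the expected count (4, 5 or 6).
close_to_expected = sum(binomial_pmf(k, cohort_size, top_grade_rate) for k in range(4, 7))

print(f"Expected top grades: {expected_count:.0f}, standard deviation: {std_dev:.1f}")
print(f"Chance of being within one of the expected count: {close_to_expected:.0%}")
```

Even with no change at all in teaching or ability, the count of top grades in this toy model has a standard deviation of about 1.9 students, and in over 40% of years it differs from the expected five by two or more.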
Huy Duong says:
27th July 2020 at 01:14
Tania,
If my teachers predict A*A*A for me, there is a big difference between me getting A*AA in the exams and me getting A*AA because of what A-level candidates in 2017-2019 did in their exams. The association between me getting A*AA in the exams and my ability is a lot stronger than the association between the fluctuating grade distribution from 2017 to 2019 and my ability.
I don’t think we should accept a 77% inflation of the A* grade, but we need two things. First, we need to know whether Ofqual’s standardisation is statistically sound. Second, even a statistically sound process can get it wrong, so there should be an appeal process for students who have been downgraded that considers each appealing student’s ability. If the standardisation is statistically unsound, it will be a kangaroo court with no appeal.
I believe that as a democratic society we should not have a potentially kangaroo court that condemns a proportion of students to the wrong grade in order to maintain the value of grades.
Note that the statistical model will downgrade some students wrongly, which means some students are not downgraded when they should have been, so even with the standardisation, the value of grades will still be downgraded inasmuch as the statistical model is unreliable. But now we still don’t know how reliable the statistical model is, so we don’t know how much of a kangaroo court it is, and we don’t know how much value for the grades the standardisation has maintained.
Huy Duong says:
27th July 2020 at 01:40
Rebecca,
I don’t disagree with the principle that the teachers’ predicted grades need to be standardised. However I worry that Ofqual might be too gung-ho about it and stray into statistically unsound territory and downgrade too many wrong students, while not giving the downgraded students the safety net of an appeal.
I wrote “gung-ho” because the reduction from 12% grade inflation for A-level to 2% seems very ambitious given the difficulty of determining the over-predicted grades.
I wrote “worry that Ofqual … and stray into statistically unsound territory” because of the small numbers involved for a lot of centre-subjects and approximately 40% of predicted grades being downgraded.
It is true that Ofqual and the exam boards have to perform standardisation every year. However, in previous years the standardisation was inter-centre: for each subject, an exam board would use nation-wide data for the cohort being assessed to set the grade boundaries, so in the past the same set of grade boundaries, i.e., the same standard, would apply to every school in the country.
This year the standardisation is primarily intra-centre: Ofqual and the exam boards will, in effect, set one set of grade boundaries per subject per school, using only that school’s data, and a major component of that data, i.e., the 2017-2019 data, is not about the cohort being assessed. What the proposed standardisation does is to force the grade distribution for a subject at a school to be consistent with the grade distribution at that school in the past three years. However, due to the weak links between the grade distribution in the past three years and the current cohort’s ability, the supposedly standardised 2020 grades for that school are not necessarily consistent with those of previous years. Furthermore, the supposedly standardised 2020 grades for that school are not necessarily consistent with those of other schools for this year either.
Tania says:
29th July 2020 at 11:33
Huy
“If my teachers predict A*A*A for me, there is a big difference between me getting A*AA in the exams and me getting A*AA because of what A-level candidates in 2017-2019 did in their exams. The association between me getting A*AA in the exams and my ability is a lot stronger than the association between the fluctuating grade distribution from 2017 to 2019 and my ability.”
Actually it is the same. The reason my son got an A instead of an A* for Physics was because the grade boundary was higher than it has been for the last (at least) 3 years and fewer A*s were awarded in 2019 due to Board moderation decisions. Nobody outside the board understands why you needed 219/270 in 2018 (8.61% of cohort) and 230/270 in 2019 (7.79% of cohort) for an A*. He got 227 (84%), which would have been an A* in any other year we looked at, so it still felt like a random thing.
Until you are affected by the randomness of exam grading, everybody thinks it is a clear and transparent thing!
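As a rough check on the figures quoted in this comment, the short Python sketch below compares a mark of 227 out of 270 with the two A* boundaries mentioned (219 in 2018 and 230 in 2019). The function name `meets_boundary` is purely illustrative, and the only data used are the numbers given above.

```python
# Figures quoted in the comment above: A* boundaries out of 270 marks.
MAX_MARK = 270
A_STAR_BOUNDARY = {2018: 219, 2019: 230}


def meets_boundary(mark: int, boundary: int) -> bool:
    """Return True if the mark reaches the grade boundary (illustrative helper)."""
    return mark >= boundary


if __name__ == "__main__":
    mark = 227
    for year, boundary in sorted(A_STAR_BOUNDARY.items()):
        outcome = "A*" if meets_boundary(mark, boundary) else "below A*"
        print(
            f"{year}: boundary {boundary}/{MAX_MARK} ({boundary / MAX_MARK:.1%}), "
            f"mark {mark}/{MAX_MARK} ({mark / MAX_MARK:.1%}) -> {outcome}"
        )
```

The same mark lands on different sides of the boundary depending only on the year, which is the point being made.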
Dennis Sherwood says:
29th July 2020 at 12:06
Hi Tania and Huy
Curiouser and curiouser… Reference to JCQ (https://www.jcq.org.uk/examination-results/) shows that, for A level Physics in England, 8.5% of all students were awarded an A* in 2019, and 9.3% in 2018.
So, two things are perhaps going on.
Firstly, a pretty harsh overall “no grade inflation” in 2019: pushing down from 9.3% to 8.5% is a big jump. And that difference of 0.8 percentage points, applied to the 2019 cohort of 36,021 candidates, implies that about 288 (actually, I mean about 290!) candidates were awarded an A in 2019, but would have been awarded an A* in 2018.
Secondly, there is a difference between the 2019 All-England (= All-board) figure of 8.5%, and your son’s board’s figure of 7.79%, or about 7.8%. That means, I think (please help me out here!), that the other boards taken together had a higher percentage than 8.5%. That in turn means that some great-power-in-the-sky took the view that your son’s board set a pretty easy A level Physics exam (!), didn’t they? I’m sure your son will agree!!!
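The arithmetic behind these two points can be sketched in a few lines of Python. The cohort size and percentages are the ones quoted above; the function names are illustrative, and the board's share of candidates (`board_share`) is a purely hypothetical value, since the comment does not give it. Whatever share is assumed, a board below the overall rate implies that the remaining boards, taken together, sit above it.

```python
def extra_a_stars(cohort: int, old_rate_pct: float, new_rate_pct: float) -> float:
    """Candidates who would have had an A* at the old rate but not at the new one."""
    return cohort * (old_rate_pct - new_rate_pct) / 100


def implied_other_rate(overall_pct: float, board_pct: float, board_share: float) -> float:
    """A* rate among all other boards combined.

    The overall rate is a weighted average:
        overall = board_share * board + (1 - board_share) * others
    so we solve for 'others'. board_share is a fraction between 0 and 1.
    """
    return (overall_pct - board_share * board_pct) / (1 - board_share)


if __name__ == "__main__":
    # First point: 9.3% (2018) versus 8.5% (2019) on a cohort of 36,021.
    print(round(extra_a_stars(36_021, 9.3, 8.5)))

    # Second point: overall 8.5%, one board at 7.79%.
    # ASSUMPTION: the 25% share is hypothetical, chosen only for illustration.
    print(round(implied_other_rate(8.5, 7.79, 0.25), 2))
```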
That last point is all about the, to me, absolute lunacy of having a “competitive market” in exams.
When I was working at Ofqual in 2013, I did a study of the exam board “competitive market”. And I hit a problem. I couldn’t discover any basis for competition, other than price. They can’t (explicitly anyway) offer different standards; or ‘service levels’ (like ‘we get the results to you sooner’). Which meant that any “competition” was very much around-the-edges, and possibly dodgy, such as the provision of seminars and teaching aids – designed largely, I fear, to give the participants the impression (or the actuality) of having some kind of insider advantage, and driving dysfunctional behaviours in schools.
And the existence of all those boards means that Ofqual has to scrutinise that many exam papers in each subject, and then has the headache of trying to “standardise” across the boards.
Interestingly, maybe the fuss about Ofqual’s lame attempts to “standardise” across schools – which has been carried out somewhat in the public gaze, albeit through hazy glasses – will cause someone to say “how do they do it across boards, and might it have similar flaws?” For the cross-board standardisation is done totally in secret…
…and is a self-inflicted wound attributable to the political dogma of “competition”.
Funny old world…
Tania says:
29th July 2020 at 14:02
Dennis, that sounds like some homework for me to look at which boards gave all the A*s – not that there is anything I can do about it. He did OCR “A”. I think AQA might have the highest number of candidates. I think his school choose the ones with the syllabus they like rather than the ones which might give better grades.
Clearly competitive boards are another nonsense which Scotland and Wales don’t have.
Ralph Hains says:
11th August 2020 at 20:34
Rebecca,
Forgive me if I have misunderstood, but the whole point of Dennis’ article is that this is foresight, not hindsight. I can see it coming, as can many others. Those that cannot appear to me to be wilfully refusing to, rather than it being hindsight. Hindsight is a far less wonderful thing than foresight. There are a load of SMART ways that the system can be improved before the train crash on Thursday, for those smart enough to see them (and Dennis has outlined two, both better, and both capable of further improvement with a wide appeals mechanism).
Dennis Sherwood says:
20th August 2020 at 17:29
Hi everyone – thank you all for your posts!
I write this on 20 August, GCSE results day.
Sooner rather than later there will be a reckoning… and maybe this blog, and everyone’s comments, might be used as evidence to counter the claim that “the process that was designed was the fairest and best possible under these unprecedented times…”
