This HEPI blog was kindly authored by Dale Bassett, Director of Assessment at United Learning. It is based on a thread on Twitter/X. For more on this topic, see HEPI and UCAS’s recent webinar.
A level results are published today and GCSE results next week. The system can be quite opaque, particularly when it comes to how grading standards are set. This blog sets out what you need to know about how it works and what you can (and cannot) read into the results.
The results in each subject are set using a principle called ‘comparable outcomes’, although Ofqual tends not to use the term any more. The idea is that a pupil achieves the same result this year as they would have done last year.
This is done so that results are fair to pupils from year to year. It is important because pupils’ performance can vary from year to year due to extraneous factors, like harder exam papers, new specifications, or pandemics. This year’s standards (and last year’s) are broadly anchored to 2019, so the cohort is protected from lost learning as a result of Covid: even if students’ performance is lower, grades are maintained.
So, how do exam boards achieve these ‘comparable outcomes’? Grade boundaries are set every year after exams are sat and marked. Because the difficulty of exam papers varies each year, grade boundaries move up or down to compensate for this. So if the paper was harder, boundaries would be lower, so that the same proportion of students would get an A (for example) as would have done if they’d sat last year’s (easier) paper.
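As a rough illustration of the idea above, here is a minimal Python sketch of moving a boundary so that the same proportion of students reaches a grade on a harder paper. The marks, the boundary of 70 and the function names are all invented for the example; this is not how exam boards actually implement awarding, which also draws on examiner judgement and other evidence.

```python
from fractions import Fraction


def proportion_at_or_above(marks, boundary):
    """Exact share of candidates whose mark is at or above the boundary."""
    return Fraction(sum(m >= boundary for m in marks), len(marks))


def set_boundary(marks, target_proportion, max_mark):
    """Highest whole-mark boundary at which at least the target share reaches the grade."""
    for boundary in range(max_mark, -1, -1):
        if proportion_at_or_above(marks, boundary) >= target_proportion:
            return boundary
    return 0  # fallback; with non-negative marks the loop returns by boundary 0


# Hypothetical marks out of 100 for ten candidates last year.
last_year_marks = [35, 42, 48, 55, 61, 66, 70, 74, 81, 90]
last_year_boundary = 70  # assumed boundary for an A last year

# Assumption: an equally able cohort sits a harder paper and scores 6 marks fewer each.
this_year_marks = [m - 6 for m in last_year_marks]

target = proportion_at_or_above(last_year_marks, last_year_boundary)
this_year_boundary = set_boundary(this_year_marks, target, max_mark=100)

print(f"Share with an A last year: {float(target):.0%}")
print(f"Boundary this year: {this_year_boundary}")
```

In this toy case the boundary drops by the same six marks that the paper became harder, so the share of students awarded an A is unchanged.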
Grade boundaries are also tweaked to adjust for the cohort, using prior attainment data (KS2 data for GCSEs; GCSE data for A-levels) as a proxy for ability. So if, for example, this year’s GCSE cohort is slightly more ‘able’ than last year’s (that is, they had slightly better KS2 results), there will be more top grades awarded. Again, this helps to ensure that a grade 6 (for example) is comparable from year to year. This is done at a national level. Individual pupils’ prior attainment has no bearing on the grade they get as an individual. That depends only on how many marks they got and on the grade boundaries for the qualification, which are the same for everyone who takes it.
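The cohort adjustment can be sketched in the same spirit. The example below assumes, purely for illustration, that candidates are grouped into three prior-attainment bands and that each band is expected to reach a given grade at the same rate as in a reference year. The band names, rates and cohort sizes are hypothetical, and the real statistical model is not described in this blog.

```python
from fractions import Fraction

# Hypothetical reference-year outcomes: share of each prior-attainment band
# that achieved the grade in question or better.
reference_rate_by_band = {
    "low": Fraction(1, 10),
    "middle": Fraction(3, 10),
    "high": Fraction(7, 10),
}


def predicted_share(cohort_counts, rate_by_band):
    """National share expected to reach the grade, given the cohort's mix of bands."""
    total = sum(cohort_counts.values())
    expected = sum(count * rate_by_band[band] for band, count in cohort_counts.items())
    return expected / total


# Hypothetical national entries by prior-attainment band.
last_year_cohort = {"low": 300, "middle": 400, "high": 300}
this_year_cohort = {"low": 250, "middle": 400, "high": 350}  # slightly more 'able'

last_year_share = predicted_share(last_year_cohort, reference_rate_by_band)
this_year_share = predicted_share(this_year_cohort, reference_rate_by_band)

print(f"Predicted share last year: {float(last_year_share):.1%}")
print(f"Predicted share this year: {float(this_year_share):.1%}")
```

Note that the calculation only ever uses cohort-level counts. No individual's prior attainment enters their own grade, which matches the point made above.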
So, the ‘starting point’ for this year’s standard is essentially last year’s standard, tweaked to reflect the ability of the cohort. There are other factors that can then move the standard – although this happens rarely, especially in large-entry subjects.
The most important of these is senior examiners’ judgement. If senior examiners see evidence in pupils’ scripts that the standard has noticeably moved (i.e. if students are genuinely performing better than in the past) they are able to argue that the standard should be moved.
Ofqual also runs a National Reference Test (NRT) to gather independent evidence of Year 11 students’ performance in English and Maths. It could use this to adjust the standard, although it has never done so (the NRTs have been in use since 2017).
Sometimes there are intentional tweaks to the standard. This year, grading will be slightly more generous in GCSE French and German (to move standards in line with Spanish) and Computer Science (to keep standards consistent after a partial reform of the qualification). And, of course, there was a deliberately different standard in 2022, as Ofqual sought to incrementally move back to the pre-pandemic standard, after the cancellation of exams in 2020 and 2021.
However, in most cases, the standard is last year’s standard, tweaked for the cohort. Because of this, all you can really read into changes in national results is that the cohort is different to last year’s. Inevitable media coverage about standards going up or down is, in most cases, nonsense.
The standard is the same as last year; if national results are slightly up or down, that reflects the nature of the cohort. Not that exams are easier or harder. Not really even that pupils performed better or worse. It just tells you that the cohort taking the exam has changed.
For example, there are more entries for GCSE Spanish this year, probably because of the EBacc policy. We might assume that many of the ‘new’ pupils doing Spanish are likely not to be high-performers (since if they were very good at languages, they might have chosen them anyway). If this is the case, the average performance in Spanish will have decreased this year, and national results will go ‘down’. But this tells you nothing about the standard of GCSE Spanish, which won’t have changed – it only tells you that a less-able cohort took the qualification.
Finally, remember that a GCSE or A-level grade tells you little or nothing about specific skills. They are compensatory assessments (you can ace one part, flunk another and come out with an ‘average’ grade) so you can’t infer specific knowledge or skills from a grade.
Some further information that might be relevant to understanding A level and GCSE results was provided by Ofqual’s then Chief Regulator, Dame Glenys Stacey, in evidence to a hearing of the Commons Education Select Committee held on 2 September 2020. In reply to a question concerning the reliability of grades, Dame Glenys stated that grades “are reliable to one grade either way” (about 12:19:47 on https://parliamentlive.tv/Event/Index/a3d523ca-09fc-49a5-84e3-d50c3a3bcbe3).
If that is the case, this has significant implications, as discussed, for example, on https://www.hepi.ac.uk/2023/06/08/if-a-level-grades-are-unreliable-what-should-admissions-officers-do/.
Dame Glenys’s statement has, however, been dismissed as “erroneous” – see https://www.hepi.ac.uk/2023/08/14/how-reliable-are-exam-grades/.
In an attempt to resolve this, at a hearing of the Lords Education for 11-16 Year Olds Committee on 13 July 2023, Lord Mike Watson asked the then Schools Minister, Nick Gibb, in essence, “Are grades reliable to one grade either way or not?”. But before Mr Gibb could answer, the Division Bell rang, and the meeting was adjourned, as may be seen about 12:23:33 on https://parliamentlive.tv/Event/Index/67100d6e-e98f-4a32-be23-2e49f318ad42.
The Chair, Lord Jo Johnson, asked Mr Gibb to reply in writing; Nick Gibb did not.
Lord Watson’s question “Are grades reliable to one grade either way or not?” has, officially, remained unanswered to this day.
This is true in terms of collective fairness, as between one year’s cohort and the next, but silent in terms of individual fairness, as between one candidate and another in any one year, which is very different. Ofqual admit that grades are only reliable within a range of plus or minus a grade. Therefore someone with an offer of ABB from their chosen university but achieving BBC might have been graded anywhere in the range from AAB to CCD. See https://srheblog.com/2024/01/22/mr-sherwood-v-the-office-of-qualifications-and-examinations-regulation1/.
This confirms a need to do away with exams at 16 and the unnecessary costs that accompany such a system.
Finland holds exams at 18 for entry to university and relies on on-going teacher assessment to identify when learners are ready to achieve, not fail.
An experienced teacher of Year 7 Maths and English will be able to identify those who will be ready to progress and achieve GCSEs at acceptable grades at the end of Year 11 or before, as well as those who will need longer. Making learners sit tests and exams when they are not ready leads to low esteem, depression and a lifelong alienation from learning.
Money saved from stopping exams at 16 will provide enough funding to make Secondary Education comparable with fee-paying schools: to improve and utilise the new technology now available for on-line learning and assessment, and to fund smaller classes and extra staff. Those staff could be provided by people no longer needed by exam boards, starting teaching as a new career using their APL experience and other qualifications.
Finland does not have fee-paying schools either, preferring to have an excellent education system for all.
Being able to work on-line where appropriate and in small tutor groups will allow time for other life-enhancing experiences, a greater choice of subjects to learn and be interested in (rather than doing something because no space is available elsewhere), and more time for subjects needing practical application or support, as well as employability skills. We would then produce more independent learners and perhaps innovative designers and critical thinkers…
I’m puzzled.
In an Ofqual blog dated 5 August 2024, the Chief Regulator writes “Grading for GCSE, AS and A levels returned to normal last summer, after disruption caused by the pandemic. This summer, students’ work is again being marked and graded in the normal way.” (https://ofqual.blog.gov.uk/2024/08/05/gcse-9-1-grades-to-uni-places-what-you-need-to-know-about-2024-exam-results/).
According to JCQ, the average % of A level A* and A grades in England, over the years 2008 – 2019 inclusive, is 26.2%, with a maximum of 26.8% (in 2010 and 2011), and a minimum of 25.2% (2019). (https://www.jcq.org.uk/examination-results/)
In England in 2023, the A*+A % was 26.5%, to my mind sensibly close to the 2008-2019 average to validate the Chief Regulator’s statement that “grading returned to normal last summer”.
Today we learn that the A*+A % in England for 2024 is 27.6%. That’s 1.1 percentage points above 2023, 1.4 above the 2008-2019 average, and 0.8 above the 2008-2019 peak.
If the standard applied this year is the same as that of the 2010s, are this year’s students substantially brighter than their 2010s counterparts? Or is there another explanation?