This is an edited version of a speech given by Josh Freeman, HEPI Policy Manager, to the Cardiff University Biochemical Society Sponsored Seminar Series on AI.
I want to start with a thought experiment – one that will cover familiar ground for many of us. A lecturer sets an assignment and receives two student essays which are very similar in argument, structure, originality and so on. The difference is that one student used AI and the other didn’t.
The first student used AI, as more than half of students (51%) do, to save time. They knew what they wanted to say, wrote a bullet-pointed list, fed this into ChatGPT and asked it to generate an essay ‘in the style of a 2nd year Biosciences student’ – which we know students are doing. Perhaps they added some finishing touches, like a bit of their own language.
The second student wrote their essay the old-fashioned way – they wrote a plan, then turned that into a draft, redrafted it, tweaked it and manually wrote their references.
The question is: Which essay should we value more? They are functionally the same essay – surely we should value them equally?
I don’t mean which essay should get the higher mark, or whether the student who used AI was cheating. Let’s assume for the moment that what they did was within the rules for this particular course. What I mean is: which essay better shows the fulfilment of the core purposes of a university – instilling intellectual curiosity, critical thinking and personal development in our students?
I think most of us would instinctively say that something has been lost for the student who used AI. We don’t value students as content creators. We don’t see the value in the essay for its own sake – after all, many of us have seen hundreds or thousands of similar essays in our time in academia. What we value is the process that got the student to that point. There is something fundamental about the writing process: in the act of writing, you are forced to confront your own thoughts, express them and sit with them. You have to consider how far you really agree with them, or whether something is missing. Though the student who used AI produced the same end result, they didn’t have that same cognitive experience.
AI is, for the first time, divorcing the output from much of the cognitive process required to get that output. Before AI, if a student submitted an essay, you could be relatively confident – barring the use of essay mills or plagiarism – that they had thought deeply, or at least substantially, about the output they submitted. The content was a good proxy for the process. But with AI, it’s remarkably easy to generate the content without engaging in the process.
I was a teacher previously, and the mantra we were told again and again was ‘Memory is the residue of thought.’ (With credit to Daniel Willingham.) We remember what we think about. When you have to sit with an essay, or a difficult academic text, it fosters more learning because your brain is working harder. If you can fast-track the essay or just read a summary of the important bits of the text, you skip the work, but you also skip the learning.
This is a problem for all kinds of reasons, some of which I’ll go into. But in another way, it may also be a good thing. For a long time, the focus has been on the content that students produce, as the best marker of a student’s skills and knowledge. But I hope that AI will force us to think deeply about what process we want students to go through.
In the time I have left, I want to touch on a few issues raised by our recent survey, showing that the vast majority of students use generative AI, including to help with their assessments.
The first is that the rabbit is out of the hat. Almost all students are using AI, for a rich variety of purposes, and almost certainly whether or not we tell them they can. That will be obvious to anyone who has received a coursework submission in the last 18 months, but it is so key that it is worth emphasising. Barring the withdrawal of large language models like ChatGPT from the internet (unlikely) or the mass socialisation of our students away from GenAI use (also unlikely, but less so), AI is here to stay.
The second is that the system of academic assessment developed over decades or more is suddenly and catastrophically not fit for purpose. Again, this will be known to many, but I am not sure the sector has fully grappled with the implications. All assessments had some level of insecurity, insofar as essay mills and contract cheating existed, but we have always felt these methods were used by relatively few students, and we were able to pass national legislation to crack down on them.
AI is different for two reasons. The first is ease of use – the barriers of seeking out an essay mill and coughing up the money are gone (though it remains true that the most powerful AI models still have a cost). The second is how students reckon with the moral implications. It is clear to almost everyone, I think, that using an essay mill is breaking the rules, so students would usually only use one when truly desperate. But AI is different. We saw in the report that there is great uncertainty when it comes to using AI – lots of disagreement about what is acceptable and what is not. When it’s cloudy in this way, it’s easier to justify to yourself that what you’re doing is okay. Most people won’t overtly ‘cheat’, but they might push on hazy boundaries if they can tell a story about why it is acceptable to do so.
So all of our assessments need to be reviewed. I recently read an essay from UCL Law School discussing how they will be using 50-100% ‘secure’ assessment, meaning in-person written or oral exams. This is a good start, though it may not even be enough if 50% of your assessments are ‘hackable’ by students with little or no subject knowledge, or with no grasp of the skills you are meant to be teaching them. And I am not convinced that ‘secure’ exams are always so secure. If essay questions are predictable, for example, you can easily use AI to generate some mock essays and memorise them.
This is also why the claims that AI will generate huge efficiency gains for the sector are misplaced, at least in the short term. In the coming years, AI will put huge strain on the sector. Essentially, we are asking all of our staff to be experts in AI tools, even as the tools themselves constantly update. For example, AI tools hallucinate a lot less than they used to and they also produce fake references much less often – and there are now specific tools designed to produce accurate references (such as ChatGPT’s Deep Research or Perplexity.AI). It is an open question as to whether this radical redrawing of assessment is a reasonable ask of the sector at a time when budgets are tight and cuts to staffing are widespread – up to 10,000 jobs lost by the end of the academic year, by some estimates.
The third issue returns to the thought experiment I presented you with at the start. We will now be forced to think deeply about what skills we want our students to have in an age where AI tools are widely accessible, and then again about how we give our students those skills.
Think again of those two essays, one of which used AI and one didn’t. There is an argument in favour of the AI-assisted essay if you particularly value teaching AI skills and you think getting AI to help with essays is one way to enhance those skills. But like developing AI-proof assessments, this is a moving target. Some people will remember the obsession with ‘prompt engineering’ in the early days of GenAI – carefully crafting prompts to coax very specific answers from chatbots – only for the models to update and render all that work useless. Because they are natural language models, they are frequently very intuitive to use and will only become more so. So it is not at all clear that even the best AI courses available now will be very useful a few years into students’ long and varied careers.
The same problem applies to courses designed to teach students the limits of AI – such as bias, the use of data without permission, hallucinations, environmental degradation and other challenges which we are hearing lots about. Small innovations could mean, for example, that the environmental cost of AI falls dramatically. There is already some research saying a typical ChatGPT prompt may now use no more energy than a Google search. In a few years’ time, we may be dealing with a very different set of problems and students’ knowledge will be out of date.
I can’t pretend HEPI has all the answers – though we do have many, and we require all of our publications to include policy solutions, which you are welcome to investigate yourselves on our website. But my view is that the skills students will receive from a university education – critical thinking, problem solving, working as a team, effective communication, resilience – are as critical as ever. In particular, we will probably need to home in on those skills that AI cannot easily replicate – soft skills such as motivating others and building trust, emotional intelligence and critical thinking – which will endure in importance even as AI automates other tasks.
But the methods we use will need to change. We hear a lot from academics, for example, about the enormous administrative burden they face. In my view, the best case is that AI automates the boring bits of all our jobs – paperwork, producing lesson materials, generating data – freeing us up to do what matters: producing innovative research and spending more time with students. That will make sure AI enhances, rather than threatens, the enormous benefits our degrees impart to students in the coming years.