This post by HEPI Director Nick Hillman:
- considers the arguments for and against the use of big data in higher education;
- looks at the reasons behind the growing use of metrics; and
- ends with a discussion on the increasing demands to contextualise metrics before using them for important decisions.
The Tyranny of Metrics
Perhaps it is because the Office for Students has just published 640 pages on higher education metrics or perhaps it is because we will soon see the long-awaited Research Excellence Framework results, but I have found myself thinking a lot about metrics in the past few weeks.
In a recent blog post, I mentioned the persuasive book Super Crunchers (2007) by the economist Ian Ayres. It provides a clear case for learning new lessons from big data. If anything, the past couple of years strengthen the case, as we have only been able to understand the spread and impact of COVID-19 by collecting huge amounts of data.
Yet in the last few days I have been reading a more recent and equally persuasive book, The Tyranny of Metrics (revised version, 2019) by the historian Jerry Z. Muller. This puts forward the opposite case. It argues – a tad inelegantly – that ‘the best use of metrics may be not to use it at all.’
While Super Crunchers mentions how numbers were used in baseball to find out more about players’ performances than expertise could ever show, Muller says this made the game ‘more regular’ and therefore ‘more boring to watch, resulting in diminished audiences.’
The Tyranny of Metrics is full of examples of data being (over)used as an accountability tool in US and UK education (as well as in other public services). The author says this encourages people to game, distort and cheat as well as to focus too much on a small handful of specific issues rather than a broader range of priorities.
Muller is a historian and some of the history he squeezes in is fascinating, including the tale of how payment by results for public services began. In the mid-nineteenth century, the Liberal MP Robert Lowe linked schools’ funding to the performance of pupils in English and Maths.
Muller recounts how, immediately, Matthew Arnold warned Lowe’s policy would narrow education down to the things that were measured and have a disproportionate impact on poorer families. (Those who complain today about Progress 8, B3 or LEO make similar criticisms.)
Among Muller’s numerous complaints about how statistics are used is one that perhaps rings especially true in public policy. He notes:
the degree of numerical precision promised by metrics may be far greater than is required by actual practitioners, and attaining that precision requires an expenditure of time and effort that may not be worthwhile.
Later on, he warns: ‘the more you measure, the greater the likelihood that the marginal costs of measuring will exceed the benefits.’
It all reminds me of a piece last year in The Sunday Times by James Timpson from the shoe repair and key-cutting firm Timpson’s. Although his company’s turnover is similar to that of a smaller university, he wrote about the benefits of stripping data collection back to basics:
There must be a point where the costs of interpreting and using data exceed the benefits of collecting it. Can you afford a chief data officer paid £120,000 a year plus bonus? We can’t, so instead we have three simple ways of understanding what’s going on.
Every night at 7pm, I get an email listing that day’s sales. … our second barometer is customer service scores, which I look at every day. … One piece of data beats everything else. A quarter of a century ago, my dad taught me the best way to measure the health of our business was to look at the cash figure every day. … This fact offers no hiding place.
Timpson also warned it is a feature of failing businesses to seek out ever more data, much of which turns out never to be used, while losing sight of what matters.
The [failing] businesses we bought were often collecting vast amounts of data from their fancy tills, yet the managers were actually reading very little of it, and it rarely helped colleagues give better customer service. As sales plummeted, they analysed more data, and brought in more finance experts and consultants to work out where the problems were. Redundancies weren’t made from the data team — it was the people on the front line, serving customers, who lost their jobs first. These companies failed because they lost focus on what’s important: great customer service.
I was reminded of this when reading The Tyranny of Metrics because Muller notes how the demand for ever more data has grown unceasingly in education. He ascribes this to the fact that there is no ‘in-built restraint’ against it. In contrast, in the business world excessive collection of redundant data is eventually seen to eat into profits.
Anyone reading the new documentation from the Office for Students might wonder if Muller has a point.
Why does higher education have a problem with data?
Aside from the demands of managers and regulators, there is surely another – generally overlooked – cause of excessive data production in our sector: the bottom-up demand for ever more data.
Instead of asking whether 640 pages is too much, people in the higher education sector are often tempted to look for clever-clever ways to show how any data that is collected is not granular or sophisticated enough. As a result, we end up implying 640 pages are not enough.
I think this problem stems from the positive fact that educators are used to critical questioning and searching out flaws, as a way of lighting up more intricate paths. But a problem arises when those alternative paths are too complicated for day-to-day use.
It can often feel like we’re in a never-ending cycle where data is produced, then criticised, then refined or replaced with something even more complicated, which then risks not being fit for purpose.
If this sounds overblown, consider the sorry story of the HESA benchmarks. These contextualise important information about each institution’s performance and have proved useful over many years. The HESA website tells the full tale, which in an abridged form is as follows:
The UK Performance Indicators (UKPIs) are official statistics which help users compare the performance of universities and colleges against benchmarks. Current UKPIs include measures for widening participation in HE, and for student non-continuation. In the past, the UKPIs have covered other aspects of HE sector performance, such as measures associated with research and graduate destinations. … Performance Indicators were first developed and published for the 1996/97 academic year. … However, higher education in the UK has seen significant changes over the time period covered by the UKPIs. … Although development work on new indicators was undertaken, there were significant difficulties in reaching consensus on UK-wide definitions and deployment. This resulted in proposed new indicators never achieving approval for launch. … We have reached the following conclusions: No clear consensus has emerged on a new strategic vision for the UKPIs. There is a clear desire to see the lack of coherence between the UKPIs and formal policy and regulatory metrics resolved. … In view of these conclusions we have decided that the UKPIs require fundamental reform. … As a result of this process of reform, we are announcing that the next edition of the UKPIs in 2022 will be the last in its current form.
Perhaps we would be better off accepting data will always be imperfect. Rather than demanding contextual factors are reflected in ever more complex data, we could ensure they are reflected in the accompanying contextual information instead. We could then ensure that the contextual information carries at least as much weight as the numbers.
This is, in effect, what the Office for Students is proposing in its current consultation on the Teaching Excellence Framework or TEF:
We propose that in carrying out their assessments, panel members should interpret and weigh up the evidence by applying their expert judgement, guided by a set of principles and guidelines. We do not propose that they should deploy an initial hypothesis (a formulaic approach used in the previous TEF based solely on the indicators) or other formulaic judgement solely based on the indicators.
The world is getting better
To return to where I started, the one shared problem uniting Muller’s data-scepticism in The Tyranny of Metrics with Ayres’s dataphilia outlined in Super Crunchers is that they assume the worst.
In fact, it is possible to take on board the warnings about the dangers of overusing or underusing data and still deliver something better, balancing the quantitative and the qualitative. (We have sought to do this in HEPI’s own work – such as our recent piece with Kaplan International Pathways on international students’ attitudes towards careers support, which rests upon quantitative polling and qualitative research.)
Beyond the new TEF proposals mentioned above, there are three positive signs that this balancing is now happening:
i. First, if you read it carefully, you will see the Office for Students’ Consultation on a new approach to regulating student outcomes maintains a big role for contextual factors:
c. The OfS [Office for Students] will consider whether it is satisfied that it has sufficient statistical evidence that an indicator or split indicator for the provider is below a relevant numerical threshold.
d. If so, the OfS will consider whether it has evidence that the provider’s context means that performance below a numerical threshold nevertheless represents positive outcomes.
e. If the OfS is not satisfied that it holds such information, it will seek further information about contextual factors from the provider.
f. If, as a result of the steps above, the OfS is not satisfied that context means the provider’s performance represents positive outcomes, it will make a provisional decision that initial condition B3 is not satisfied. The OfS will then consider representations from the provider before reaching a final decision.
ii. Secondly, in a thoughtful speech delivered at a HEPI / Elsevier conference in October 2020, the then Minister for Science, Amanda Solloway MP, noted the gradual improvements to research evaluation that had been made but also queried how metrics are used in the Research Excellence Framework, and announced a review of the process for evaluating research.
I have made a point of listening carefully to the research community over the last few months … You – and I have to say thank you for this – have not been backwards in coming forwards about the things that are getting in your way … The challenges you face in research – and the culture which is at the root of these. It is clear to me that many of you feel pressure from the wider evaluation system – pressure to demonstrate particular things to your peers and your superiors – things which sometimes make very little sense. … This gives rise to related issues – we know people feel pressured to show significant results from their work, to get it published, just to justify the effort and investment involved. This could be having a profound effect on the very integrity of science itself – leading to questionable research practices and evidence of a growing crisis in the reproducibility of research. … We have created this situation, in part because of the way we evaluate success. These are not new problems, but the good news is that the UK is leading the way in tackling them. … The REF exercise of today would be hardly recognisable to those involved in the early selectivity exercises of the 1980s. Although intended for simple purposes, universities have turned the REF into a major industry, with rising costs and complexity. … There are now very few parts of academic life in the UK that are not affected in some way by the REF. … Indeed, we know that 4 in 10 surveyed researchers believe that their workplace puts more value on metrics than on research quality. … we must be prepared to look to the future and ask ourselves how the REF can be evolved for the better, so that universities and funders work together to help build the research culture we all aspire to. … So I have today written to Research England to ask them to start working with their counterparts in the devolved administrations on a plan for reforming the REF after the current exercise is complete.
iii. Thirdly, Universities UK’s long-awaited paper on assessing quality is more interesting than its dull title, Framework for programme reviews, suggests, for it seeks to take a panoramic view rather than a small snapshot.
It is worth looking at the short paper in its entirety but this one sentence stood out for me:
The use of metrics in the framework is principled, flexible where appropriate, and sensitive to both the limits of quantitative approaches and the importance of wider contextual information.
So away from all the hackneyed and clichéd whinges about the ‘neo-liberal marketisation’ of higher education, an optimist might say we are actually in the midst of an important shift in higher education policy.
Current initiatives typically reject both the old ways that (largely) ignored data and the one-dimensional performance indicators common since the days of Major / Blair in the UK and Clinton / (George W) Bush in the US.
And that is surely something to celebrate in these dark times.
I like Nick’s comments about the use of metrics. They remind me of a talk I had a week ago with an academic friend and we were remembering the work of the CNAA long ago. I worked for the CNAA for many years. Peer review was central and it was a learning process for all participants (reviewers and reviewed). Metrics and other forms of data were there, but ‘talking and listening to people’ was central!