- This HEPI guest blog was kindly written by Roger Watson, Academic Dean, School of Nursing, Southwest Medical University, China, Dean Korošak, Vice-Rector for Science and Research, University of Maribor, Slovenia and Gregor Štiglic, Vice-Dean for Research, Associate Professor and Head of Research Institute, Faculty of Health Sciences, University of Maribor, Slovenia
We live in a culture where performance review and assessment are ubiquitous. In some occupations, the parameters for such review are relatively easy to derive. Salespeople can be assessed based on how much they sell, recruitment consultants can be assessed based on how many people they recruit, and manufacturers can be assessed on the basis of how many products they make. These ideas have been naïvely imported into academia where individual performance, especially regarding research, is assessed by means of volume and metrics.
It is understandable why this should be the case. Using numbers is easy, and volume and metrics provide numbers whereby performance can purportedly be measured and compared. The problem with this approach to measuring performance in academia is that the numbers, while not entirely or always meaningless, can become meaningless if the wrong value is attributed to them. Moreover, unlike sales, recruitment or production figures, these numbers can be gamed, and it is glaringly obvious that this happens in some sectors of academia and in some countries in Europe.
It is hard to be over-critical of people who adopt the ‘publish or perish’ mentality and game the volume and metrics system of performance review in academia. After all, they neither established the system nor set the parameters by which their performance is evaluated. If they are seeking continued employment or promotion then they can only meet the targets which are set for them. However, this does not make it right.
The prime example of volume by which individual research performance is assessed is publication output; the metrics are citation-based and inevitably related to the impact factors of journals, the total citations accrued and the individual's h-index. Volume of publications is easy to game; securing publication in high-impact-factor journals is harder, but not impossible. The h-index is less malleable still, but it too can be gamed.
Therefore, as long as we persist in evaluating individuals by volume of publications, they will persist in publishing as many articles as possible. The problem is that the quality of the research reported in those articles is often unrelated to the volume.
We are not alone, and we are not the first to recognise this problem or to try to find a solution. Universities across the world pay lip service, for example, to eschewing, or at least reducing, dependence on impact factors to evaluate research outputs. Many have indicated their agreement to this effect by signing the Declaration on Research Assessment, which specifically argues against the use of journal impact factors in the assessment of individual academic performance, and, in Europe, CoARA (Coalition for Advancing Research Assessment), which specifically argues in favour of peer review of research outputs. Nevertheless, universities persist in using impact factors to assess individual performance and, as a result, individual academics continue to game the system.
With the above in mind, and recognising that a perfect system which pleases all universities and individual academics is elusive, we propose some ideas which may help to mitigate the problems inherent in over-dependence on volume- and metrics-based systems of individual research performance review.
The reliance on volume could be mitigated by restricting the number of published articles that individual academics are required to submit for appraisal and for promotion. Clearly, the frequency of appraisal and the seniority of the individual should be considered. Targets should be set in advance and there should be an emphasis on quality rather than quantity. Highly productive academics should not necessarily be discouraged from publishing a high volume of articles, but they should be required to select what they consider to be their highest-quality articles, up to whatever number is required, for evaluation.
Universities should insist that academics publish in high quality and reputable journals, for example those included on the Clarivate list, the Emerging Sources Citation Index, the Directory of Open Access Journals, or similar. But evaluation of articles should not be based on metrics, either citations or the impact factor of the journals in which the articles are published. Unavoidably, if quality of articles is to be evaluated then a process of peer review needs to be established. Clearly, this is more labour intensive than using metrics but restricting the number of articles should facilitate the process. Naturally, the validity of any peer review system can be questioned but the United Kingdom Research Excellence Framework has demonstrated, by means of calibration exercises, that the system is reliable. It limits the number of outputs that can be submitted by individuals, and it is considered fit for the purpose of allocating government research infrastructure funding to universities. This could easily be adapted to the evaluation and allocation of credit to individual academics.
Issues remain to be resolved, such as multiple authorship of articles, especially where several of those authors are from the same institution. Nevertheless, this is not an insurmountable problem if precise targets are set regarding outputs and impact, and lead and co-authors are identified in advance. A tariff could be established for the relative contribution individuals have made to specific outputs. This is possibly even more crucial in countries where the number of publications per individual researcher has been manipulated to such a degree that the regulatory agencies responsible for evaluating research performance have been forced to exclude all authors but the first and last from the metrics.
As we indicate above, a perfect system for evaluating the performance of academics is an unlikely prospect. But we consider that such a system has not been tried and found wanting; we have failed to try it at all. Whatever systems evolve, the main point we make is that we must move away from the virtually meaningless, demonstrably flawed and gameable systems of volume and metrics used to assess the performance of individual academics. If such systems were tried, tested and adapted to individual institutional needs, and good practice then shared and spread, there could be a positive 'knock-on' effect on academic publishing.
Currently we live under the tyranny of the 5,000-word peer-reviewed manuscript, which continues to be widely regarded as the gold-standard proof that a substantial contribution has been made to the literature in a particular field of science. The increasing number of retractions of articles and the widespread violations of publication ethics suggest that this kind of output may no longer be fit for purpose. In an age of pre-printing, sharing of data and analysis source code, study registration and Wikipedia-type sites which can be edited by multiple authors as fields develop, surely the 5,000-word peer-reviewed dinosaur is antediluvian, to say the least. Even more troublesome is the fact that we have known about this for some time, yet nothing has been done.
Publishers fear a loss of profits from the endeavours of academics, and editors fear a loss of control over those endeavours. However, both still have a place: publishers in providing the platforms whereby data, registrations, pre-prints and updateable records of knowledge can be curated, and editors in ensuring the quality and propriety of what enters the public domain and in prioritising science communication, thus fostering an environment of trust and understanding. It strikes us that universities, academics, publishers and editors have nothing to fear and a great deal to gain.