The Metric Tide: rethinking research on research

Today HEFCE publishes a report called The Metric Tide – a review of the role of metrics in research assessment and management. The review calls for more research on research. Liz Allen, Head of Evaluation at Wellcome – and one of the report's authors – explores what this means.

One of the big takeaways from the HEFCE review is that metrics cannot replace peer review in the next Research Excellence Framework (REF). The review sets out the perils and pitfalls of relying too heavily on the quantitative indicators (typically bibliometric data) that are currently at our disposal for assessing aspects of research quality. Refreshingly, the sector was on the whole not opposed to the idea of using more research-related data to support exercises like the REF, provided these can be sourced efficiently and at minimal burden to the research ecosystem.

The emerging research data infrastructure should help us on this journey. Interoperability across various platforms for research management, and for reporting outputs and grant awards – aided by initiatives like ORCID – will help to connect research-related data. I hope that we start to think more broadly and holistically about research-related data. This is particularly pertinent as we think about how to respond to another recommendation from the report that "Research funders need to increase investment in the science of science policy".

One of the interesting observations for me was how unscathed peer review was throughout our deliberations. While the 'flaws and limitations' of peer review are acknowledged, we were unable to articulate with any confidence their magnitude. The report's conclusions reflect the prevailing view that metrics shouldn't be used without peer review to assess research.

To date, mainly due to a lack of comprehensive data, exercises like the REF have examined research excellence and impact through a limited lens. The UK REF panels assessed research that had already been funded ('ex-post evaluation', to use the jargon), but no data was available on the process of selecting the research for funding. (This was likely to have been a form of peer review – a process which effectively filtered out alternate research proposals.)

We wouldn't judge the success (outcomes or impact) of a vaccine without systematically taking account of the inputs and context of its use. The characteristics of the patient, their condition, the dose, the doctor, and a plethora of other factors could all have an effect. Yet in research evaluation we have tended to focus on research outputs and impacts where there is some systematic data – namely scholarly output and numbers of doctoral awards – but we have largely ignored the inputs and broader context. Information on the latter, such as researcher demographics, career stage, or type of grant awarded, is less readily available for analysis. Perhaps most importantly, aspects of the grant selection process – and there could be lots of variables here – are also missing from the analysis.

A recent study of National Institutes of Health (NIH) grants found that (at least in terms of publication outputs) peer review, beyond screening out poor applications early in the funding process, was not a good predictor of which grants would be most 'successful'.

There are many interpretations and implications of this conclusion. You might even ask whether, beyond an initial (peer) screening to remove 'non-fundable' applications, funders could do just as well to select at random from those that remain. This would certainly reduce the cost and burden on the whole research ecosystem, but it's not a solution for the faint-hearted. It does, however, raise some interesting questions, especially in times of squeezed budgets and falling grant award rates.

  • Do our research funding processes leave a lot of potentially great work unfunded?
  • Are we good at selecting the types of research most likely to deliver the kinds of outcomes we desire? 
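The screen-then-lottery idea above can be sketched in a few lines. This is purely an illustration of the mechanism, not any funder's actual process: the `screen` function, the scoring, and the threshold are all hypothetical stand-ins for whatever an initial peer screening would decide.

```python
import random

def select_awards(applications, screen, n_awards, seed=None):
    """Hypothetical two-stage selection: peer review screens out
    'non-fundable' applications, then awards are drawn at random
    from those that remain (an illustrative sketch only)."""
    fundable = [app for app in applications if screen(app)]
    rng = random.Random(seed)  # seed makes the draw reproducible
    return rng.sample(fundable, min(n_awards, len(fundable)))

# Example: screen on a (made-up) minimum quality score, then fund 2 at random.
apps = [{"id": i, "score": s} for i, s in enumerate([3, 7, 8, 5, 9, 6])]
awards = select_awards(apps, screen=lambda a: a["score"] >= 6, n_awards=2, seed=42)
```

The point the sketch makes concrete is that once the screen is passed, expert ranking plays no further role – which is exactly the scenario the NIH study's findings invite us to take seriously.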

Without knowing more about the funding process, exercises like the REF are, in part, effectively validating research funding decisions that have already been made, and made in a peer review system where there is little in the way of systematic data or metrics.

A further challenge for research impact evaluation is that self-selection and reporting can paint a skewed picture. A REF submission typically involves research institutions showcasing their best in class. The process leaves little room for incremental research, research that didn't discover anything major or didn't work, or research that perhaps wasn't thought 'impactful' enough to be submitted as a case study or get much air time – but that doesn't mean the funding was wasted.

The costs of national research exercises are considerable and come in many forms, including money, administrative burden, and the emotional energy of all involved. The impact element of the 2014 Research Excellence Framework alone is estimated to have cost institutions around £55m – almost as much as the entire 2008 Research Assessment Exercise (RAE), even when accounting for inflation.

The call for more 'science of science' is therefore exciting and timely – how might we make funding science more scientific? The question presents an opportunity for sector-wide engagement and collaboration to bring insight and efficiencies to how we fund research, what we should value in different contexts, and what does and doesn't work.

I hope we extend our thinking beyond current efforts to refine the 'indicators' we use to assess research outputs and impact once a funding decision is made. We need to open up the funding process to greater scrutiny and consider how innovation in the funding space, alongside new and evolving metrics, can be used to best develop, support and fund a thriving research base in the UK.