Reforming Research Assessment Is a Systemic Challenge
Last month, more than 300 leaders from universities and funders came together in Copenhagen to discuss the need for change in the way research is assessed.
Since 2012, the Declaration on Research Assessment (DORA) and the Coalition for Advancing Research Assessment (CoARA) have consolidated a consensus that over-relying on publication metrics and journal prestige distorts research culture, discourages risk-taking, and undervalues the many other contributions that researchers make. But actual reform is moving slowly. Many institutions have made formal commitments, but in most places day-to-day assessment looks much the same as before.
This gap between principle and practice reflects the fact that research assessment is not a single policy choice but a system – and systems are remarkably good at reproducing themselves.
One useful way to think about this challenge comes from System Shift’s “4 Keys” model. In our view, changing a system requires simultaneous shifts across four dimensions: purpose, power, resource flows, and relationships. Applying this lens helps explain why reform is hard, and where action is most needed.
1. Purpose: What Is Research Assessment Actually For?
Research assessment should answer a deceptively simple question: What do we value in research and in researchers? Much has been said about valuing research quality, open science and societal impact, but current assessment systems still prioritise productivity and prestige – implicitly or otherwise. As long as high citation counts and elite journal publications remain the system’s de facto goals, there is a risk that new assessment frameworks will be co-opted to serve those old ends.
Reform requires redefining the system’s purpose. Institutions and funders need to be explicit (and consistent) about the goals that assessment is meant to serve, and about which behaviours should be rewarded and which discouraged. For example, all Dutch universities and research funders have signed up to the Recognition & Rewards programme, which redefines what constitutes merit. In their position paper “Room for Everyone’s Talent”, they commit to broadening recognition beyond papers alone by:
Enabling more diverse career paths
Recognising team performance alongside individual achievements
Emphasising quality of work over sheer number of publications
Encouraging all aspects of open science
Cultivating high-quality academic leadership
To put these principles into practice, Dutch institutions are training their assessment committees, putting an increased emphasis on collaboration, and no longer asking for journal-level metrics in evaluations.
Similarly, the University of Cambridge is working to improve research culture through changes in hiring and career development – an example of an institution aligning its practices with a new purpose. Its Action Research on Research Culture (ARRC) project, for instance, pilots narrative CVs in postdoc recruitment to highlight a broader range of contributions. By explicitly committing to value research for its substance and societal contribution rather than the prestige of the institution where it happened to take place, Cambridge is clarifying what good research assessment means in practice.
2. Power: Who Gets to Decide?
Research assessment systems are shaped by those with the authority to make hiring, promotion, and funding decisions. Today, many of these decision-makers are senior researchers who have succeeded under the current system. Even when they are sympathetic to reform, strong incentives (and time pressures) encourage them to fall back on familiar proxies for merit. If reform is to stick, power dynamics need to shift. That implies giving a greater voice to emerging research leaders and early-career researchers, setting clearer mandates for evaluation panels, and sending stronger signals from the top about what will not be accepted in assessments.
We can see hints of this power shift in both bottom-up and top-down efforts. On one side, emerging research leaders themselves are pushing for change. For example, Dr. Anne Gärtner at TU Dresden leads a grassroots “Responsible Research Assessment” initiative designed to reform how faculty hiring and promotion work. Motivated by the conviction that universities should move from focusing on quantity to emphasising research quality, the initiative is developing new qualitative indicators to support that shift. Gärtner, who won a 2023 Einstein Foundation Award for this work, exemplifies how emerging scientists can become visible champions for changing who sets the standards.
At the same time, major funders are also adjusting the power levers by changing the rules for committees. Germany’s principal science funder, the German Research Foundation (DFG), has explicitly embraced responsible assessment reforms. It now mandates a narrative CV format for grant applications, and explicitly forbids the use of the Journal Impact Factor (JIF) or h-index in evaluation. In adopting these policies, DFG is effectively constraining what funding panels can base their judgments on. Notably, DFG emphasises that meaningful reform must be owned by the scientific community itself rather than imposed in a purely top-down way. This shifts the balance of power: authority is being exercised to set new norms, while the community is invited to take responsibility for upholding them.
3. Resource Flows: What Is Actually Rewarded?
Systems change when money, time, and status start flowing differently. At present, resources in academia – research funding, career advancement, awards – still overwhelmingly flow to those who excel by the established metrics. Many early and mid-career researchers are acutely aware of this reality. They might endorse research assessment reform in principle, yet still feel pressure to optimise for the established metrics to secure grants or jobs. As long as career progression is perceived to depend on these metrics, championing new approaches will feel risky at the individual level.
Real reform requires that resource flows are redirected to align with the new values. In practice, this means restructuring grant processes, promotion criteria, and recognition systems to reward a wider range of contributions. Funders, in particular, have leverage: what they choose to fund, and how they make funding decisions, sends a powerful signal to the whole system. An exemplar here is Denmark’s Villum Fonden, which has made responsible assessment principles core to how it funds research. Along with its sister foundation (Velux), Villum has developed a five-year action plan under CoARA to evaluate and refine its review practices annually. It has even experimented with anonymous applications in some grant programmes to reduce bias and encourage high-risk, high-impact proposals.
Most importantly, Villum has changed what it asks for and rewards in applications. It formally prohibits applicants from listing JIF or h-index, uses structured narrative CVs that highlight an applicant’s achievements and ambitions, and looks for qualities like originality, interdisciplinary collaboration, and the ability to build a strong research team. By embedding these criteria into its major funding streams (not just special pilot programmes), Villum is sending a clear message that pursuing bold science and nurturing a good research culture will be materially rewarded. When researchers see that grants and promotions truly do favour quality of ideas, rigour, openness, mentorship, and other diverse contributions, the incentive to rely on old metrics diminishes. Redirecting resources in this way is one of the most concrete ways of making reform real for individuals on the ground.
That said, one resource the approach proposed by CoARA may not sufficiently account for is time. Qualitative evaluation, assessing a wider range of contributions, and writing narrative CVs all demand more time from actors in the research system, many of whom already feel time-poor.
4. Relationships: Who Is Acting Together?
Universities, funders, and researchers themselves all operate within an interconnected ecosystem, and no one can reform research assessment alone. If each acts in isolation, change will be slow and uneven – for instance, Dutch institutions that have stopped using JIF for promotion decisions might see their researchers disadvantaged when competing for grants from international funders whose criteria haven’t changed. Instead, shared frameworks, common timelines, and visible coalitions are needed to make change easier to imagine and harder to ignore, by assuring stakeholders that they won’t be acting alone. Initiatives like DORA and CoARA are important steps in this direction, creating forums for agreement on principles. The real test of success, however, is whether collaboration extends beyond high-level endorsement into joint implementation and mutual accountability.
Happily, collective efforts are emerging beyond the Netherlands – for instance, a Latin American network (FOLEC) is aligning regional principles, and dozens of institutions in Europe are collaborating through CoARA chapters. The common thread is building relationships of trust and collaboration so that reform-minded leaders reinforce each other. When universities and funders publicly stand shoulder to shoulder, it makes it much easier for each member of the coalition to push changes internally.
An Amplifier for Research Assessment Reform?
If research assessment is to change in practice, we must treat it as what it is: a resilient system that will only shift when enough of its moving parts are realigned. So how can this realignment be accelerated?
One idea comes from System Shift’s concept of an “amplifier,” a model we originally developed in the context of social innovation and the future of work. Where startup accelerators function like funnels, selecting a few promising projects and helping them scale up (while many others fall away), an amplifier is designed to strengthen an entire field. Its goal is to increase the traction of existing initiatives by connecting them to each other – unlocking what we call “collaborative advantage” – and to catalyse new, emergent activity. Meanwhile, by engaging the major players in the field, it becomes possible to tackle the system barriers initiatives come up against, such as the additional time burden created by moving to new assessment practices.
An amplifier for research assessment reform would connect the many efforts already underway and help them reinforce one another. For example, it could create a space for funders like Villum and DFG to share what they’ve learned about qualitative review, while also helping universities swap lessons on implementing CoARA principles. It could shine a light on what’s working – whether it’s a new hiring policy that reduced bias, or a funding scheme that incentivised innovative research – and thus make it harder for traditionalists to dismiss the movement as impractical. It could support emerging leaders as visible champions (imagine amplifying the impact of people like Anne Gärtner who are already driving change), and convene unlikely allies across disciplines and regions, generating exciting new initiatives.
By building connectivity and momentum across the 4 Keys of purpose, power, resources, and relationships, such an amplifier could accelerate the transition from scattered experiments to systemic change. It would ensure that the push for research assessment reform is not a whisper in isolated departments, but a chorus across the academic world, too loud and unified to be ignored. If it’s an idea you might be interested in supporting, we would love to hear from you.
Acknowledgements
This piece was inspired by the Villum Fonden whitepaper “Emerging Research Leaders and Research Assessment Reform”, and we thank its authors, Dr. Steven Wooding and Sam Gilbert, for helpful comments and discussion during its development.