
Scalene 54: ScholarPeer / 2027 / non-binary

Humans | AI | Peer review. The triangle is changing.

There is no intro this week. Ceci n’est pas une pipe. Onwards.

15th February 2026

1//
When AI reviews science: Can we trust the referee?

The Innovation Informatics - 10 Feb 2026 - 14 min read

This paper provides a security- and reliability-centered analysis of AI peer review. We map attacks across the review lifecycle—training and data retrieval, desk review, deep review, rebuttal, and system-level. We instantiate this taxonomy with four treatment-control probes on a stratified set of ICLR 2025 submissions, using two advanced LLM-based referees to isolate the causal effects of prestige framing, assertion strength, rebuttal sycophancy, and contextual poisoning on review scores. Together, this taxonomy and experimental audit provide an evidence-based baseline for assessing and tracking the reliability of AI peer review and highlight concrete failure points to guide targeted, testable mitigations.

CL: I was hoping for much more from this paper, given its premise and the fact that it systematically analyses the various ways in which LLMs can help improve papers (including my favourite, AI-assisted self-review) - but some of the constructs around prestige engineering are pretty convoluted and not reflective of likely real-world use. It is still worth reading to get a holistic sense of what AI is doing to peer review and (to a lesser extent) manuscript preparation. I’ve never seen this covered in a single paper before.
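For a sense of what a treatment-control probe looks like in practice, here’s a minimal sketch of a prestige-framing test. To be clear, the prestige cue, the prompt wording, the 1-10 scale, and the `call_referee` helper are my assumptions for illustration, not the paper’s actual setup.

```python
# Hypothetical sketch of one treatment-control probe (prestige framing).
# `call_referee` is a placeholder for an LLM-based referee; the prestige
# cue and 1-10 scale are assumptions, not the paper's protocol.
from statistics import mean

PRESTIGE_CUE = "Note: the authors are affiliated with a leading AI lab.\n\n"

def call_referee(manuscript: str) -> float:
    """Placeholder: send the manuscript to an LLM referee, parse a 1-10 score."""
    return 5.0  # replace with a real model call and score extraction

def prestige_score_shift(manuscripts: list[str]) -> float:
    """Mean score shift caused by prepending a prestige cue (treatment - control)."""
    control = mean(call_referee(m) for m in manuscripts)
    treatment = mean(call_referee(PRESTIGE_CUE + m) for m in manuscripts)
    return treatment - control

if __name__ == "__main__":
    papers = ["<full text of a sampled submission>"]
    print(f"Prestige framing shift: {prestige_score_shift(papers):+.2f}")
```

The same scaffold works for the other three probes: swap the cue for stronger assertion wording, a sycophantic rebuttal, or poisoned context.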

2//
ScholarPeer: A Context-Aware Multi-Agent Framework for Automated Peer Review

arXiv.org - 30 Jan 2026 - 80 min read

This was exciting - a paper from a team of engineers at Google addressing a long-standing problem in AI and peer review. LLMs can - with increasing comprehensiveness - analyse a manuscript as a standalone object, but they often fail to place it in the context of the related work that preceded it. Here, a multi-agent approach avoids that "parametric vacuum" problem, replicating the live contextual knowledge a human expert brings to review. ScholarPeer does this through three specialist agents: a historian that constructs a chronological domain narrative; a baseline scout that adversarially identifies missing comparisons; and a multi-aspect Q&A engine that probes novelty and technical soundness against web-scale retrieval.

“ScholarPeer achieves significant win-rates against state-of-the-art approaches in side-by-side evaluations and reduces the gap to human-level diversity.”

Claude gives some decent feedback on how this could be improved, but I think it’s a great piece of work on its own and a solid basis for future research.
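As a mental model (not the authors’ code), here’s roughly what a three-agent pipeline like this could look like. The agent roles come from the paper’s description; the `ask_llm` and `web_search` helpers, prompts, and data shapes are my own placeholders.

```python
# Illustrative sketch of a ScholarPeer-style three-agent pipeline.
# The roles come from the paper's description; everything else here
# is a placeholder, not the authors' implementation.
from dataclasses import dataclass

def ask_llm(prompt: str) -> str:
    """Placeholder for a call to an LLM."""
    return "..."

def web_search(query: str, k: int = 10) -> list[str]:
    """Placeholder for web-scale retrieval of related-work snippets."""
    return []

@dataclass
class Review:
    domain_narrative: str    # historian: chronological story of the field
    missing_baselines: str   # baseline scout: adversarial gap-finding
    qa_findings: str         # Q&A engine: novelty and soundness probes

def review(manuscript: str) -> Review:
    context = "\n".join(web_search(manuscript[:300]))
    return Review(
        domain_narrative=ask_llm(
            f"Construct a chronological narrative of prior work:\n{context}"),
        missing_baselines=ask_llm(
            f"Given prior work:\n{context}\n\nWhich comparisons are "
            f"missing from this manuscript?\n{manuscript}"),
        qa_findings=ask_llm(
            f"Probe the novelty and technical soundness of:\n{manuscript}\n"
            f"against:\n{context}"),
    )
```

The point of the design, as I read it, is that each agent grounds its prompt in retrieved context rather than the model’s parametric memory - hence the escape from the "parametric vacuum".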

3//
Peer Review 2027: Scenarios for Academic Publishing in the Age of AI

OSF Preprint - 26 Jan 2026 - 12 min read

A refreshing view on how peer review will evolve in the social sciences, from a group of journal editors looking just 10 months ahead. A sensible time frame, one feels, in this age of hyper-speed development of AI tools. The authors propose four scenarios - "Let it Rip," "Shut It Down," "Only Editors and Reviewers Get To Use AI," and "Increase AI Production, Increase Human Evaluation" - with the last judged the most likely to occur.

It stands out from more wishy-washy analyses of the future in that it suggests metadata we can actually track and use. In their words: “The academic publishing system has operated for decades with remarkably little empirical self-examination. If we are serious about adapting peer review to the age of AI, we must also be serious about building the informational infrastructure to monitor that adaptation.”

To that end, they propose AI disclosure checkboxes, systematic researcher surveys, qualitative reports of reviewers’ and editors’ experiences, institutional experiments in peer review, and quantitative data streams on submission, desk-rejection, and acceptance rates. Nothing scary, and most of these have been tried before, but together they would allow us to adapt peer review for the AI age.

HT: Nisha Doshi for the link. Thx.

4//
We need to move beyond the accept/reject binary in peer review

LSE Blog - 02 Feb 2026 - 7 min read

George Currie and Damian Pattinson make a strong case that the end point of peer review - a binary accept or reject decision - is unhelpful for communicating the validity of science.

Their argument, in brief:
- The current accept/reject system creates a false binary of certainty.
- Journal prestige and publication bias distort how research is judged.
- Many studies vary in reliability and need ongoing scrutiny.
- Open, public review would make evaluations transparent and iterative.
- Transparent review would preserve critique, reduce duplication, and better reflect science’s provisional nature.

5//
Detecting AI-Generated Content in Academic Peer Reviews

arXiv.org - 30 Jan 2026 - 12 min read

This study examines the temporal emergence of AI-generated content in peer reviews by applying a detection model trained on historical reviews to later review cycles at International Conference on Learning Representations (ICLR) and Nature Communications (NC). We observe minimal detection of AI-generated content before 2022, followed by a substantial increase through 2025, with approximately 20% of ICLR reviews and 12% of Nature Communications reviews classified as AI-generated in 2025. The most pronounced growth of AI-generated reviews in NC occurs between the third and fourth quarter of 2024.

Nice to see a real journal in among the machine learning conferences that most peer review research seems to rely on. The overall percentages may seem low or high to you, depending on your viewpoint, but it’s worth looking at the Claude review to get a sense of how these numbers could be made more reliable. What’s not in question for me is that peer review reports are increasingly being ‘assisted’ by AI tools, whether the reviewers declare that or not.
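The temporal analysis itself is straightforward once you have a detector. Here’s a sketch of the quarterly aggregation behind figures like "~20% of ICLR reviews in 2025"; `is_ai_generated` stands in for their trained classifier, and the data format is an assumption on my part.

```python
# Sketch of the quarterly aggregation. `is_ai_generated` is a placeholder
# for a detector trained on historical (pre-LLM) reviews; the input
# format (date, text) pairs is an assumption for illustration.
from collections import defaultdict
from datetime import date

def is_ai_generated(review_text: str) -> bool:
    """Placeholder for a classifier trained on pre-2022 reviews."""
    return False

def quarterly_flag_rates(reviews: list[tuple[date, str]]) -> dict[str, float]:
    """Share of reviews flagged as AI-generated, per calendar quarter."""
    buckets: dict[str, list[bool]] = defaultdict(list)
    for when, text in reviews:
        quarter = f"{when.year}-Q{(when.month - 1) // 3 + 1}"
        buckets[quarter].append(is_ai_generated(text))
    return {q: sum(flags) / len(flags) for q, flags in sorted(buckets.items())}
```

The caveat the Claude review presumably picks up on: rates like these inherit the detector’s false-positive rate, so the trend over time is more trustworthy than any single quarter’s absolute number.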

And finally…

The Journal for AI Generated Papers: https://jaigp.org (interestingly, submissions will be peer-reviewed from next month).

Wikipedia:Signs of AI Writing: who better than Wikipedia editors to show us where text has been generated by AI? A real eye-opening read, this one.

AIR: AI in Research - A framework for transparent and responsible AI use mapped to the research process.

And finally, finally, two surveys that I’d like to bring to your attention:
- Participate in the STM Study on Trust and Authority
- AI & Me - how people in publishing use AI

Let’s chat

I’ll be at the R2R conference in London, 24-25 Feb, but always happy to chat, email, meet - just reply to this email.

Curated by me, Chris Leonard.
If you want to get in touch, please simply reply to this email.