Scalene 43: AutoRev / ReviewIQ / vibes

Humans | AI | Peer review. The triangle is changing.
Detecting AI-generated papers - and peer reviews - seems to be becoming popular at the moment. But I do wonder if that is pulling our focus from a better place. Given we can’t un-invent LLMs, and reviewers and authors are clearly already using them, isn’t it better to provide tools, guidance, and guardrails on how to use them ethically and optimally? A slightly amusing story below shows how this isn’t being implemented right now.
Also, a quick note to let you know I’m in Chicago at the Peer Review Congress 2-5 September, and then at ALPSP in Manchester the week after. Come and say hi or reply to this email if you want to set up a meeting.
Onwards!
17th August 2025
1//
Ensuring peer review integrity in the era of large language models: A critical stocktaking of challenges, red flags, and recommendations
Eur J Radiol Artif Intell - 23 Apr 2025 - 17 min read
As one of the article’s highlights explains, detecting LLM-shaped reviews remains complex, although there are some tell-tale signs that warrant further investigation. For publishers and editors, the takeaways are: set clear AI-in-review policies and disclosures; preserve confidentiality, possibly by providing in-house LLM tools for reviewers to minimize data leakage; and train editors to spot red flags while maintaining human oversight. There is also a call to pilot new tools for doing any of this.
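To make the red-flag idea concrete, here is a minimal Python sketch of phrase-based screening. The tell-phrase list and threshold are my own illustrative assumptions, not taken from the paper, and a high score should only ever prompt a human look, never an automatic decision.

```python
# Minimal sketch: surface reviews whose phrasing matches common LLM tells.
# The phrase list and threshold are illustrative assumptions; a real
# workflow would calibrate them against reviews of known provenance.

LLM_TELL_PHRASES = [
    "as an ai language model",
    "delve into",
    "it is important to note",
    "in conclusion, this manuscript",
]

def red_flag_score(review_text: str) -> float:
    """Fraction of tell-phrases that appear in the review."""
    text = review_text.lower()
    hits = sum(1 for phrase in LLM_TELL_PHRASES if phrase in text)
    return hits / len(LLM_TELL_PHRASES)

def needs_human_check(review_text: str, threshold: float = 0.5) -> bool:
    # A score above the threshold warrants editor attention only --
    # human oversight, not automated rejection, stays in the loop.
    return red_flag_score(review_text) >= threshold
```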

2//
AutoRev: Automatic Peer Review System for Academic Research Papers
arXiv - 20 May 2025 - 34 min read
AutoRev proposes a practical, scalable aid for pre-review triage and guidance. It models a paper as a hierarchical graph (yes!) and trains a GNN to identify salient passages, which are then fed to a fine-tuned LLM to produce structured reviews.
Our novel framework represents an academic document as a graph, enabling the extraction of the most critical passages that contribute significantly to the review. This graph-based approach demonstrates effectiveness for review generation and is potentially adaptable to various downstream tasks, such as question answering, summarization, and document representation. When applied to review generation, our method outperforms SOTA baselines by an average of 58.72% across all evaluation metrics.

Although trained exclusively on the ICLR 2024 dataset, the results are promising and - as ever - the real jewels are in the appendices, where you can see the methods and prompts used to generate these insights.
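To make the shape of the pipeline concrete, here is a toy Python sketch of the graph-then-LLM idea. It is emphatically not AutoRev’s implementation: PageRank over a hierarchy plus lexical-overlap edges stands in for their trained GNN, and call_llm is a hypothetical placeholder for whatever chat-completion client you prefer.

```python
# Toy sketch of a graph-then-LLM review pipeline. PageRank over a
# passage graph stands in for AutoRev's trained GNN salience scorer.
import networkx as nx

def build_paper_graph(sections: dict[str, list[str]]) -> nx.Graph:
    """Hierarchy (paper -> section -> passage) plus lexical-overlap edges."""
    g = nx.Graph()
    for name, passages in sections.items():
        g.add_edge("paper", name)
        for i, text in enumerate(passages):
            g.add_node(f"{name}/{i}", text=text, words=set(text.lower().split()))
            g.add_edge(name, f"{name}/{i}")
    # Link passages that share enough vocabulary, so central passages
    # accumulate salience from their neighbours.
    nodes = [(n, d) for n, d in g.nodes(data=True) if "text" in d]
    for i, (n1, d1) in enumerate(nodes):
        for n2, d2 in nodes[i + 1:]:
            union = d1["words"] | d2["words"]
            if union and len(d1["words"] & d2["words"]) / len(union) > 0.2:
                g.add_edge(n1, n2)
    return g

def salient_passages(g: nx.Graph, k: int = 5) -> list[str]:
    """Top-k passages by PageRank -- the stand-in for the GNN scorer."""
    scores = nx.pagerank(g)
    ranked = sorted(
        (n for n, d in g.nodes(data=True) if "text" in d),
        key=scores.get,
        reverse=True,
    )
    return [g.nodes[n]["text"] for n in ranked[:k]]

def call_llm(prompt: str) -> str:
    # Hypothetical placeholder: wire in any chat-completion client here.
    raise NotImplementedError

def draft_review(sections: dict[str, list[str]]) -> str:
    context = "\n\n".join(salient_passages(build_paper_graph(sections)))
    return call_llm(f"Write a structured peer review grounded in:\n\n{context}")
```

The real system learns which passages matter from review data; the point here is only the overall flow - graph construction, salience selection, then a structured prompt to an LLM.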
3//
Academy of Management AI policy tangle
LinkedIn - 14 Aug 2025 - 2 min read
A strange tale from the world of management journals. Dries Faems highlights how an editor of the Academy of Management Journal (AMJ) has created a management-research feedback tool in Claude, intended to improve the quality of submissions to the journal - but whose use is outlawed by the journal’s own AI policy. Is this intentional or a mistake? Feature or bug? I’m still not sure. Read the post to make up your own mind:
4//
Crowdhelix’s ReviewIQ
Crowdhelix - 15 Aug 2025 - 4 min read
It is no surprise to see Crowdhelix pushing the envelope in the right way on AI assistance, and doubly nice to see it applied to grant applications - something of immediate value to academics, but often overlooked. ReviewIQ is due to launch this coming week (Aug 19th), and it appears to be a writing-feedback tool for MSCA Doctoral Network proposals rather than a review tool per se - although that would seem a logical next step, if it isn’t already a feature. One to watch this week.
5//
The peer-review crisis: how to fix an overloaded system
Nature - 06 Aug 2025 - 10 min read
This is a great overview of most of the problems with, and potential fixes to, the current peer review system. Although the article only really highlights AI in supporting the mechanics of current workflows, two non-AI points stood out for me: 1) paying peer reviewers can slash turnaround times to a week without any loss in review quality, and 2) distributed review for grants is faster and dilutes the gatekeeping power of senior academics. Not all evolution needs to involve AI!
And finally…
Let’s Measure Information Step-by-Step: LLM-Based Evaluation Beyond Vibes - I’m highlighting this paper now, but I’m still evaluating it: I think its implications for peer review are deep, and I will probably highlight it again next time, once I’m confident I’ve ‘got’ it all. Take a look now if you’re so inclined.
Google Confirms That AI-Generated Content Should Be Human Reviewed - https://www.searchenginejournal.com/google-says-ai-generated-content-should-be-human-reviewed/553486/
Reimagining peer review: a case for innovation - https://www.researchinformation.info/viewpoint/reimagining-peer-review-a-case-for-innovation/
One-fifth of computer science papers may include AI content - https://www.science.org/content/article/one-fifth-computer-science-papers-may-include-ai-content
And from the story above, this looks like it could be a fascinating hot mess: https://agents4science.stanford.edu
One year ago: Scalene 11, 04 Aug 2024

Let’s chat
I’ll be at the Peer Review Congress in Chicago in early September, and then ALPSP in Manchester shortly thereafter. Wanna meet?
Curated by me, Chris Leonard.
If you want to get in touch, please simply reply to this email.