
Scalene 35: Persistent prompting / reviewer feedback / FDA

Humans | AI | Peer review. The triangle is changing.

Hey! It’s been a while, huh? That’s for a few reasons. First of all, I’ve been busy in my day job with a few forward-looking publishers who are trialing a hybrid human-AI peer review workflow, which is incredibly exciting for all concerned. Secondly, it feels like we’re getting close to a point where the rate of change of peer review processes is not constrained by technology, but by human adoption of that technology - the triangle is getting more scalene, as it were. And finally, as I’ve alluded to before, there seem to be lots of exciting developments in the realm of AI-powered science (before a manuscript is written) - and I’m very easily distracted.

11th May 2025

1//
AI-Driven Scholarly Peer Review via Persistent Workflow Prompting, Meta-Prompting, and Meta-Reasoning

arXiv.org - 06 May 2025 - 28 min read

I love this concept of persistent workflow prompting to get the best out of current reasoning LLMs by guiding them through a typical peer review process. In particular, the ability to create different reviewer personas to get a plurality of views on a paper looks like it could go some way to replicating current peer review workflows. The supporting information on p22 (and onward) is a gold mine if you want to replicate this approach.

Submitted once at the start of a session, this PWP prompt equips the LLM with persistent workflows triggered by subsequent queries, guiding modern reasoning LLMs through systematic, multimodal evaluations. Demonstrations show the PWP-guided LLM identifying major methodological flaws in a test case while mitigating LLM input bias and performing complex tasks, including distinguishing claims from evidence, integrating text/photo/figure analysis to infer parameters, executing quantitative feasibility checks, comparing estimates against claims, and assessing a priori plausibility.
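
To make that concrete, here’s a minimal sketch (mine, not the authors’) of how a persistent workflow prompt with multiple reviewer personas might be assembled for a chat-style LLM API. The persona names and workflow steps below are illustrative assumptions, not the paper’s actual prompt - that lives in the supporting information mentioned above.

```python
# A minimal sketch of "persistent workflow prompting" for peer review.
# The personas and workflow steps are illustrative assumptions, not taken
# verbatim from the paper; the core idea is that one long prompt is sent
# once at the start of a session, and later queries trigger the workflows
# it defines.

REVIEWER_PERSONAS = {
    "methods_sceptic": "Scrutinise experimental design, controls, and statistics.",
    "domain_expert": "Judge novelty and whether the claims are supported by the evidence.",
    "careful_reader": "Check figures, units, and internal consistency of reported numbers.",
}

WORKFLOW_STEPS = [
    "Separate the paper's claims from the evidence offered for them.",
    "Integrate text, figures, and photos to infer key experimental parameters.",
    "Run quantitative feasibility checks and compare estimates against claims.",
    "Assess a priori plausibility and flag major methodological concerns.",
]


def build_persistent_prompt() -> str:
    """Assemble the one-off prompt submitted at the start of the session."""
    personas = "\n".join(f"- {name}: {brief}" for name, brief in REVIEWER_PERSONAS.items())
    steps = "\n".join(f"{i}. {step}" for i, step in enumerate(WORKFLOW_STEPS, 1))
    return (
        "You are a panel of peer reviewers. Adopt each persona in turn:\n"
        f"{personas}\n\n"
        "For any manuscript I send later in this session, follow this workflow:\n"
        f"{steps}\n"
        "Report each persona's findings separately, then give a consolidated verdict."
    )


def start_review_session(manuscript_text: str) -> list[dict]:
    """Return the message list you would hand to any chat-style LLM API."""
    return [
        {"role": "system", "content": build_persistent_prompt()},  # sent once, persists
        {"role": "user", "content": f"Please review this manuscript:\n\n{manuscript_text}"},
    ]


if __name__ == "__main__":
    messages = start_review_session("(paste manuscript text here)")
    print(messages[0]["content"])
```

Later queries in the same session (“now focus on the statistics”, “compare the claimed yield against your own estimate”) then reuse those pre-loaded workflows without re-sending the long prompt - which is what makes it persistent.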

2//
Can LLM feedback enhance review quality? A randomized study of 20K reviews at ICLR 2025

arXiv.org - 13 April 2025 - 41 min read

Conferences, as well as journals, are bearing a heavy peer review burden due to increasing submission numbers. And while you may still have the luxury of timely human reviews, unfortunately they are not always helpful. This paper has data on over 12,000 LLM-generated suggestions sent to reviewers of conference submissions, flagging vague comments, content misunderstandings, and unprofessional remarks in their reviews. Reviewers who updated their reviews (a disappointingly small 27%) returned longer and more informative reviews, as assessed by blinded human researchers. Engagement during subsequent revisions also increased.

3//
A triumvirate of stories on a similar theme

Various - April/May 2025

What happens when it’s obvious to you that an LLM has generated a report or a paper? I pulled these three stories together as they all appeared at around the same time with similar concerns.

Even at first glance the comments I received on my manuscript in January this year seemed odd.

First, the tone was far too uniform and generic. There was also an unexpected lack of nuance, depth or personality. And the reviewer had provided no page or line numbers and no specific examples of what needed to be improved to guide my revisions.

For example, they suggested I “remove redundant explanations”. However, they didn’t indicate which explanations were redundant, or even where they occurred in the manuscript.

They also suggested I order my reference list in a bizarre manner which disregarded the journal requirements and followed no format that I have seen replicated in a scientific journal. They provided comments pertaining to subheadings that didn’t exist.

And although the journal required no “discussion” section, the peer reviewer had provided the following suggestion to improve my non-existent discussion: “Addressing future directions for further refinement of [the content of the paper] would enhance the paper’s forward-looking perspective”.

The interdisciplinary call to action proposed there aims at restoring trust in peer review precisely by acknowledging its human imperfections. In that spirit, perhaps part of the solution is to candidly acknowledge, too, that the flaws in an inherently social pursuit are not fixed by outsourcing fundamentally human decisions to the machines, which reproduce human error and bias in opaque ways.

Realistically, automation of some reviewing tasks via AI is a certainty going forward – but research integrity need not suffer in the name of quick fixes. To adopt an optimistic view, the AI revolution could be a catalyst for radically rethinking what, in its most entrenched forms, has ostensibly become a broken and inequitable system of knowledge gatekeeping.

This creates a new and tedious burden on physicians — proofreading content that was neither written nor dictated by the user is difficult to do well. Minimal research has been published to understand the net benefit on efficiency and cognitive burden when physicians’ efforts are shifted from generating new content to reviewing content generated by LLMs. Furthermore, the use of LLMs increases physicians’ accountability for something they did not write. The liability of missing a high-risk error in LLM output is a novel factor for consideration and not well understood.

4//
SciPinion Survey on Peer Review

SciPinion - 27 Apr 2025 - 9 min read

I’ve seen very little discussion of this survey on peer review, which is a shame as it holds some great quantitative (n=200+) and qualitative data on academics’ opinions of the current state of peer review. Six questions, each with comments from the people surveyed.

TL;DR? The current peer review system is unsustainable, reviewers should be paid, and people have become more critical of peer review - but it is good at exposing methodological flaws. AI should be used in a supporting role. It’s worth spending a few minutes reading the comments too.

5//
FDA aggressively rolls out AI-assisted review

US FDA - 08 May 2025 - 3 min read

I’m generally an optimist when it comes to how AI can speed up and un-bias peer review, but I also push the boundaries often enough to know it isn’t there yet without expert human oversight. I was therefore intrigued to see the same lessons emerge from this US FDA trial, where review tasks were reduced from days to minutes. This has prompted an aggressive rollout of the same tools/workflows across all FDA centres by June 30th.

Scant details right now, but I’ll be keeping an eye on this one for sure.

And finally…

When I get distracted and don’t send this for a few weeks, there are inevitably a lot of interesting things I didn’t get round to covering - not all strictly related to peer review. Here they are in link form for you to snack on:

Let’s chat
Many of you may know I work for Cactus Communications in my day job, and one of my responsibilities there is to help publishers speed up their peer review processes. Usually this is in the form of 100% human peer review, delivered in 7 days. However, we are now offering a secure hybrid human/AI service in just 5 days. If you want to chat about bringing review times down with a 100% human service, or you’re interested in experimenting with how AI can assist, let’s talk: https://calendly.com/chrisle1972/chris-leonard-cactus

Curated by me, Chris Leonard.
If you want to get in touch, please simply reply to this email.