
Scalene 58: Popularity, REM CTX, politics

Humans | AI | Peer review. The triangle is changing.

Hello again. It’s time to put on our reading hats and peer review gloves for the next instalment of this beautifully unpredictable newsletter. If you’re dressed appropriately, we can commence…

3rd May 2026

1//
Can AI Be a Good Peer Reviewer? A Survey of Peer Review Process, Evaluation, and the Future

arXiv.org - 30 Apr 2026 - 30+ min read

This is a great overview of current strengths and limitations around AI peer review - all delivered in the framework of four evaluation methods: human-centric evaluation, LLM-based evaluation, reference-based evaluation, and aspect-oriented evaluation. The main text of the review article is relatively short, but the appendices are fantastically detailed and thoughtful - particularly Appendix C and Tables 6 & 7. Not going to say why, you’ll have to go and look. I’ve saved this to savour during a long flight.

2//
OpenAI’s Prism turns AI into a tough scientific paper reviewer

GadgetBond - 08 Apr 2026 - 5 min read

I’ve not really mentioned OpenAI’s Prism so far as this newsletter is not explicitly interested in how AI helps authors in writing papers, but recently they announced a new module which does interest us: Paper Review, described in this article as “a tough, always‑on reviewer that pokes holes in weak math, sloppy reasoning, and overconfident claims”.

Paper Review is explicitly built to behave like a careful technical reviewer, not a style assistant. When you run a paper through it, the workflow looks for issues in math, derivations, notation, units, and structure, and checks whether the claims in the abstract and conclusion are genuinely supported by the results sections and figures.

There are other tools out there that do similar things (Paperpal, from Cactus for one - note I am a Cactus employee), but the framing of this development - as a way for AI to turn from slop-generator to sophisticated filter - is something developers of similar tools should consider too. Note this is not peer review, but rather author-driven pre-review - something where most AI tools deliver some value.

3//
We Call It Peer Review. Sometimes It Is Just Popularity

Substack - 25 Apr 2026 - 4 min read

It’s good to take a step back sometimes and consider some of the problems we face in traditional human peer review. This post, by Nouran Hamza, focuses on grant applications, but the arguments for journal publications are similar. Hamza argues that even data-driven evaluation is biased.

The metrics we use to evaluate research quality, including journal prestige, citation counts, and institutional reputation, were built by and for a specific kind of researcher. One who writes in English as a first language. One who is affiliated with a university in North America or Europe. One who already has a publication record in journals that are themselves ranked by the same system that rewards publishing in them.

This is circular. It is also the current standard.

We do not fund the best ideas. We fund the ideas that look most like the ideas we have already funded. That is not science. That is pattern matching with a grant committee in the middle.

And while AI can help, if it encapsulates the current evaluation mechanisms, it is hardcoding biases that we could intentionally avoid. As Hamza says, it’s time to stop being polite about it.

4//
The costs and benefits of research grant funding peer review

F1000 - 16 Apr 2026 - 22 min read

And while we’re on the subject of grant applications, here’s a study detailing exactly how much it currently costs to review them, and who bears the burden of those costs. Its finding? Total transaction costs averaged about £24,445 per application, or 13% of the funding awarded, and applicants bear about 89% of those costs in time and effort. It was also noted that underrepresented groups, along with those with the least experience, tend to spend more time preparing a grant/fellowship application than their academic colleagues.

Two notes: 1) this is not yet peer-reviewed, and 2) F1000 - your DOIs for this paper aren’t working.

5//
REM-CTX: Automated Peer Review via Reinforcement Learning with Auxiliary Context

arXiv.org - 31 Mar 2026 - 22 min read

Recent research shows how Large Language Models (LLMs) are increasingly integrated across the full scientific workflow, including research ideation, experimentation, review generation, and iterative refinement. With the rapid advancement of LLM capabilities, research on automated peer-review generation has gained significant attention. These systems range from relatively simple prompting-based approaches that directly generate reviews from manuscript text, to more complex agentic frameworks involving multiple collaborating models, and systems that incorporate external knowledge or multi-modal context. Empirical evaluations suggest that such automated systems can produce feedback comparable to human reviewers in certain aspects and occasionally surpass human reviews in consistency and coverage.

Despite these advances, most automated peer-review generation systems rely primarily on internal model knowledge or textual manuscript content alone. This limitation can lead to inaccurate novelty assessments, insufficient grounding in prior literature, and incomplete discussion of visual elements such as figures. One approach to address these gaps is to use multi-modal LLMs, which can directly process visual inputs and provide feedback on figures. However, current multi-modal models still have incomplete modality coverage, with no support for (citation) graphs or other non-visual data types. An alternative is to augment a text-based LLM with auxiliary contextual information encoded in text form, and train the model to incorporate it into its outputs.
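To make the second approach concrete, here’s a minimal sketch of the general idea of text-encoded auxiliary context: serialize signals the model would otherwise miss (retrieved related work for novelty checks, figure captions for visual elements) as plain text and assemble them into a single review prompt for a text-only LLM. This is my own illustration of the concept, not the REM-CTX authors’ implementation - all function and field names here are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class AuxiliaryContext:
    """Hypothetical container for context serialized as text."""
    related_work: List[str] = field(default_factory=list)    # retrieved abstracts
    figure_captions: List[str] = field(default_factory=list)  # captions as text

def build_review_prompt(manuscript: str, ctx: AuxiliaryContext) -> str:
    """Assemble one text prompt that grounds the review in auxiliary context."""
    sections = []
    if ctx.related_work:
        sections.append("## Related work (for novelty assessment)\n" +
                        "\n".join(f"- {a}" for a in ctx.related_work))
    if ctx.figure_captions:
        sections.append("## Figure captions (visual elements, text-encoded)\n" +
                        "\n".join(f"- {c}" for c in ctx.figure_captions))
    sections.append("## Manuscript\n" + manuscript)
    sections.append("Write a peer review that cites the related work above when "
                    "judging novelty and discusses the figures by caption.")
    return "\n\n".join(sections)
```

The same pattern would extend to other non-visual data types the paper mentions, such as citation graphs, so long as they can be flattened into text the model can read.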

And finally…

You can just review things: A digital ethnography of informal peer review - a study of informal post-publication peer review and how it works.

How to do a good peer review? - from the wonderful PaperWizard team.

Peer review at the service of society - a fascinating introduction to the history of politics and peer review. I want to read more.

Let’s chat

OK, big news. I’m working on a presentation for SSP in Chula Vista in three-and-a-bit weeks’ time. It’s going to introduce the concept of personal peer review, and I’m excited to share it with a wider audience. If you are attending SSP, please do two things: 1) come to my talk - otherwise there won’t be a ‘wider audience’ - and 2) come and talk to me. If you want to set up a meeting about anything to do with this newsletter, reply to this email. Thanks and see you there.

Curated by me, Chris Leonard.
If you want to get in touch, please simply reply to this email.