Scalene 3: Singularities / Double blind cracked / PrivateAI

Chris Leonard
June 02, 2024

Humans | AI | Peer review. The triangle is changing.

Hello and welcome to issue 3 - especially if you’re a new subscriber. I set this newsletter up with the author’s best interests at heart (no one wants to wait 6 months for a rejection or superficial review) - but clearly that can only happen when publishers are willing to experiment and gradually adopt new ways of working.
Judging by the posts coming out of SSP this week, it is clear that, for now, there is great appetite for human expertise to be augmented by judicious and sensible application of AI tools - and that this represents a win for both publishers and authors.
I’ll report more on SSP when the dust has settled next week. For now, there is much to catch up on. [CL]

// 1
Four Singularities for Research
Just as I pressed ‘send’ on last week’s newsletter, Ethan Mollick published this incredibly insightful overview of AI in research and education. The whole thing is worth reading in full, but this snippet really caught my eye:

And yet, we may not want to dismiss the idea of AI helping with peer review. Recent experiments suggest AI peer reviews tend to be surprisingly good, with 82.4% of scientists finding AI peer reviews more useful than at least some of the human reviews they received from on a paper, and other work suggests AI is reasonably good at spotting errors, though not as good as humans, yet. Regardless of how good AI gets, the scientific publishing system was not made to support AI writers writing to AI reviews for AI opinions for papers later summarized by AI. The system is going to break.

https://www.oneusefulthing.org/p/four-singularities-for-research

CL - Ethan has his finger on the pulse here. I’m consistently amazed by the quantity and quality of what he posts. If you don’t follow him on LinkedIn, start today.

// 2
Eliciting Informative Text Evaluations with Large Language Models
This is gold.

In summary, our research introduces a pioneering framework for eliciting high-quality textual judgment. To the best of our knowledge, our work is the first to design peer prediction mechanisms for eliciting high-quality textual reports. We propose two mechanisms, GPPM and GSPPM, which utilize the LLM-derived prediction, two implementations for estimating the LLM-derived prediction, and an evaluation workflow. The use of LLM prediction could extend to other peer prediction mechanisms, given that prediction is the foundation of most peer prediction mechanisms. Our empirical results demonstrate the potential of the GPPM and GSPPM to motivate quality human-written reviews over LLM-generated reviews.

https://arxiv.org/abs/2405.15077v2

CL - As an aside, it is notable that most experiments with AI and peer review are coming from the computer science conference side of the fence. Would be good to see work with more full-length manuscripts in diverse fields as a part of this conversation.

// 3
AI can crack double blind peer review – should we still use it?
You can get the gist of this from the title and the image below, but it’s the openness of the authors’ approach that is commendable (Yann LeCun would approve!)

As peer-review is such a fundamental pillar of science, we hope that this study encourages the research community to further explore how AI is changing peer-review itself. We have open-sourced our codebase (https://github.com/uzh-rpg/authorship_attribution) in the hope that it serves as a starting point for scholars to pick-up our work and build on top of it. Authorship attribution and plagiarism detection are vital to ensure the continued integrity and trustworthiness of academic publishing, and enhancing it will be beneficial to the entire scientific community.

https://blogs.lse.ac.uk/impactofsocialsciences/2023/08/08/ai-can-crack-double-blind-peer-review-should-we-still-use-it/

// 4
Emerald’s stance on AI in peer review
A fairly typical policy statement on the use of AI tools in peer review, but one that interestingly invokes COPE guidelines on article authorship. Peer review reports are seemingly treated in the same way as research articles:
https://www.emeraldgrouppublishing.com/publish-with-us/ethics-integrity/research-publishing-ethics#ai

CL - There seem to be no COPE position statements specifically on AI and peer review, which could be a useful addition to their offerings.

// 5
One to keep an eye on
The ‘upstarts’ of the decentralised science world have fewer hang-ups about reinventing how science is evaluated and communicated - and actively embrace opportunities that others may be hesitant to take up.
PrivateAI.com seem to be linking knowledge graphs to LLMs to create a peer review service. This could solve one of the main limitations of current LLMs: they are good-to-great at evaluating the manuscript as a single entity, but struggle to place it in the context of previous work. Follow their X feed for updates on this:
https://x.com/privateAIcom

And finally…
Last week I mentioned a news piece in Semafor on AI in peer review, but failed to provide the link to it. Here it is (sorry!):
https://www.semafor.com/article/05/08/2024/researchers-warned-against-using-ai-to-peer-review-academic-papers

More fun next Sunday.
If you want to get in touch with me, please simply reply to this email.