Scalene 27: Collaboration / Errors / Doom
Humans | AI | Peer review. The triangle is changing.
A slightly shorter update this week, due to family commitments and the fact that relatively few noteworthy items crossed my radar. Let's go!
19 January 2025
// 1
Multi-Agent Collaboration Mechanisms: A Survey of LLMs
arXiv - 10 Jan 2025 - 23 min read
Recent advancements in Large Language Models (LLMs) have transformed artificial intelligence (AI), enabling them to perform sophisticated tasks such as creative writing, reasoning, and decision-making, arguably comparable to human level. While these models have shown remarkable capabilities individually, they still suffer from intrinsic limitations such as hallucination, auto-regressive nature (e.g., incapable of slow-thinking), and scaling laws. To address these challenges, agentic AI leverages LLMs as the brain, or the orchestrator, integrating them with external tools and agenda such as planning, enabling LLM-based agents to take actions, solve complex problems, and learn and interact with external environments. Furthermore, researchers are increasingly exploring horizontal scaling — leveraging multiple LLM-based agents to work together collaboratively towards collective intelligence. This approach aligns with ongoing research in Multi-Agent Systems (MASs) and collaborative AI, which focus on enabling groups of intelligent agents to coordinate, share knowledge, and solve problems collectively. The convergence of these fields has given rise to LLM-based MASs, which harness the collective intelligence of multiple LLMs to tackle complex, multi-step challenges.
CL: This paper mentions academic manuscript writing assistance as a use case, but peer review is another obvious one. I love how the authors have compared their review to others via a table of strengths and weaknesses(!), but this is a good overview of using many separate tools with differing strengths to achieve a goal. Team-based AI, if you will.
https://arxiv.org/pdf/2501.06322
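To make that 'team-based AI' idea concrete, here is a minimal, hypothetical sketch of the collaboration pattern the survey describes: role-specialised agents passing work between each other under a simple orchestrator. All the names here are mine, and the model call is stubbed out; in practice it would be a real LLM API call.

```python
# Toy multi-agent loop: "agents" are just role-prompted calls to a model.
# call_model() is a stand-in for any real LLM API client.

def call_model(system_prompt: str, user_prompt: str) -> str:
    """Stand-in for a real LLM call."""
    return f"[{system_prompt!r} responding to {user_prompt[:40]!r}...]"

def drafter(manuscript: str) -> str:
    """Specialist agent: drafts a peer review of the manuscript."""
    return call_model("You are a peer reviewer. Draft a review.", manuscript)

def critic(draft_review: str) -> str:
    """Specialist agent: critiques the draft for tone and omissions."""
    return call_model("You check reviews for tone and missing points.", draft_review)

def orchestrate(manuscript: str, rounds: int = 2) -> str:
    """Orchestrator: alternates drafting and critique, returns the final review."""
    review = drafter(manuscript)
    for _ in range(rounds):
        feedback = critic(review)
        review = call_model("Revise the review using this feedback.",
                            review + "\n\n" + feedback)
    return review

if __name__ == "__main__":
    print(orchestrate("A manuscript abstract goes here."))
```

Each 'agent' here shares one underlying model; the survey's point is that the same pattern extends to genuinely separate tools and models, each contributing its own strengths.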
// 2
ERROR Review
ERROR - 2 min read
ERROR is a comprehensive program to systematically detect and report errors in scientific publications, modelled after bug bounty programs in the technology industry. Investigators are paid for discovering errors in the scientific literature: The more severe the error, the larger the payout. In ERROR, we leverage, survey, document, and increase accessibility to error detection tools. Our goal is to foster a culture that is open to the possibility of error in science to embrace a new discourse norm of constructive criticism.
CL: Not sure how this has escaped my attention until now, but what a fantastic resource from Malte Elson and Ruben Arslan at the University of Bern. See the completed reviews, and make sure to read the FAQ about 'what constitutes an error'.
https://error.reviews
https://error.reviews/faq_general/
// 3
Can AI create high-quality, publishable research articles?
Research Information - 16 Jan 2025 - 5 min read
Something peer review managers need to be aware of: the increasing capability of LLMs to create something that looks reviewable (and is publishable? We'll see). Keep your eyes on this link to see what transpired at a 3-day workshop in Mannheim with multiple stakeholders, and read all about the much-needed experiment here:
https://www.researchinformation.info/analysis-opinion/can-ai-create-high-quality-publishable-research-articles/
// 4
Tomatoes roaming the fields and canaries in the coalmine: another embarrassing paper for MDPI
BishopBlog - 18 Jan 2025 - 7 min read
Another gobbledegook paper that passed peer review (three reviews, no less) and is polluting the published academic corpus. Unfortunately, this is no longer newsworthy in itself, but Dorothy makes a good point that publishers should heed:
The point about a paper like this is that it is so blatantly bad that it cannot have been through any kind of serious editorial scrutiny or peer review. It acts as a canary in the coalmine: if gobbledegook is published in your journal, it's an indicator that you need to look very carefully at your editorial processes, and act immediately to remove editors who let this stuff in.
And finally…
No link list this week; instead, a gentle reminder that the ubiquitous PDF is not a static file format. In this post, read about how a researcher got the game Doom running inside a PDF! Almost as impressive as getting it to run on a pregnancy testing stick.
Read more here: https://www.theregister.com/2025/01/14/doom_delivered_in_a_pdf/
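If you want to convince yourself that PDFs can carry live code, here is a minimal sketch using the pypdf library (the filename and alert text are my own) that embeds document-level JavaScript, the same kind of mechanism the Doom port exploits:

```python
# Embed document-level JavaScript in a PDF with pypdf. Viewers that
# support PDF JavaScript (e.g. Adobe Acrobat) run it on open; many
# lightweight viewers simply ignore it.
from pypdf import PdfWriter

writer = PdfWriter()
writer.add_blank_page(width=612, height=792)  # US Letter size, in points

# JavaScript executed when the document is opened.
writer.add_js('app.alert("This PDF just ran code. Not so static!");')

with open("not_static.pdf", "wb") as f:
    writer.write(f)
```

Open the result in a JavaScript-capable viewer and the alert fires: proof that a 'document' can also be a program.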
Let's do coffee!
- London - 30 Jan (free for meetings)
- Researcher 2 Reader conference, London - Feb 25-26
- London Book Fair, London(!) - Mar 11-13
- ALPSP UP Redux, Oxford - April 3-4 [I’m giving the keynote speech on the 3rd]
Let me know if you’re at any of these and we can chat all things Scalene-related.
Free consultation calls
Many of you may know I work for Cactus Communications in my day job, and one of my responsibilities there is to help publishers speed up their peer review processes. Usually this takes the form of 100% human peer review, delivered in 7 days. However, we are keen to experiment further with subtle AI assistance. If you want to chat about bringing review times down with a 100% human service, or you're interested in experimenting with how AI can assist, let's talk: https://calendly.com/chrisle1972/chris-leonard-cactus
Curated by me, Chris Leonard.
If you want to get in touch, please simply reply to this email.