Scalene 9: Image detection / xFakeSci / Paying reviewers

Humans | AI | Peer review. The triangle is changing.

Peer review, and the AI assistance that helps deliver it, isn’t restricted to text. Fake image generation has garnered a lot of attention recently, from ‘Dicky Mouse’ to forearms with three bones. But most of the malpractice is less amusing and harder to detect, so it was heartening to read research on identifying fake Western blots in the literature, along with a radical(!) idea to improve uptake of reviewer invitations.

21st July 2024

// 1
AI Detectors are Poor Western Blot Classifiers: A Study of Accuracy and Predictive Values
arXiv.org - 14 July 2024 - 23 min read

This study evaluates the efficacy of three free web-based AI detectors in identifying AI-generated images of Western blots, which is a very common technique in biology. We tested these detectors on a collection of artificial Western blot images (n=48) that were created using ChatGPT-4 (DALL·E 3) and on authentic Western blots (n=48) that were sampled from articles published within four biology journals in 2015; this was before the rise of generative AI based on large language models. The results reveal that sensitivity (0.9583 for Is It AI, 0.1875 for Hive Moderation, and 0.7083 for Illuminarty) and specificity (0.5417 for Is It AI, 0.8750 for Hive Moderation, and 0.4167 for Illuminarty) differ widely between detectors. Positive predictive values (PPV) across various AI prevalence levels were low, for example reaching 0.1885 for Is It AI, 0.1429 for Hive Moderation, and 0.1189 for Illuminarty at an AI prevalence of 0.1. This highlights the difficulty in confidently determining image authenticity based on the output of a single detector. Reducing the size of Western blots from four to two lanes reduced test sensitivities and increased test specificities but did not markedly affect overall detector accuracies, and only slightly improved the PPV of one detector (Is It AI). These findings strongly argue against the use of free AI detectors to detect fake scientific images, and they demonstrate the urgent need for more robust detection tools that are specifically trained on scientific content such as Western blot images.

https://arxiv.org/abs/2407.10308

CL: Not much to add here, other than: if you’re interested in an AI image detector specifically trained on Western blot images, get in touch and I can point you towards the soon-to-be-released Cactus tool. [Disclosure: I am an employee of Cactus, but the plug seems relevant in this instance.]
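If you want to see where those low PPV figures come from, they follow from the standard formula PPV = (sensitivity × prevalence) / (sensitivity × prevalence + (1 − specificity) × (1 − prevalence)). The short Python sketch below is my own illustration, not from the paper; the sensitivity and specificity values are taken from the abstract, while the function and variable names are mine. It reproduces the quoted PPVs at an AI prevalence of 0.1.

```python
# Sketch: reproduce the PPV figures quoted in the abstract from the standard
# definition PPV = (sens * prev) / (sens * prev + (1 - spec) * (1 - prev)).
# Detector names and sensitivity/specificity values come from the abstract;
# everything else here is my own illustrative naming.

def positive_predictive_value(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Probability that an image flagged as AI-generated really is AI-generated."""
    true_positives = sensitivity * prevalence
    false_positives = (1 - specificity) * (1 - prevalence)
    return true_positives / (true_positives + false_positives)

detectors = {
    "Is It AI":        (0.9583, 0.5417),
    "Hive Moderation": (0.1875, 0.8750),
    "Illuminarty":     (0.7083, 0.4167),
}

for name, (sens, spec) in detectors.items():
    ppv = positive_predictive_value(sens, spec, prevalence=0.1)
    print(f"{name}: PPV = {ppv:.4f}")

# Output matches the abstract: 0.1885, 0.1429, 0.1189 — i.e. at a 10% prevalence
# of fake images, most "AI-generated" flags from these detectors are false alarms.
```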

// 2
Detection of ChatGPT fake science with the xFakeSci learning algorithm
Scientific Reports - 14 July 2024 - 30 min read

In this study, we show how articles can be generated using means of prompt engineering for various diseases and conditions. We then show how we tested this premise in two phases and prove its validity. Subsequently, we introduce xFakeSci, a novel learning algorithm, that is capable of distinguishing ChatGPT-generated articles from publications produced by scientists. The algorithm is trained using network models driven from both sources.

CL: Results are an impressive step in the domain-specific detection of text generated by ChatGPT, but I wonder how useful this will be in years to come when GenAI is ubiquitous. Better to have authors check everything and sign off on a paper saying ‘this is work I stand by’. Do we care if it’s generated by ChatGPT if it’s correct?

// 3
Paying reviewers and regulating the number of papers may help fix the peer-review process
F1000 Research - 09 July 2024 - 28 min read

The exponential increase in the number of submissions, further accelerated by generative AI, and the decline in the availability of experts are burdening the peer review process. This has led to high unethical desk rejection rates, a growing appeal for the publication of unreviewed preprints, and a worrying proliferation of predatory journals. The idea of monetarily compensating peer reviewers has been around for many years; maybe, it is time to take it seriously as one way to save the peer review process. Here, I argue that paying reviewers, when done in a fair and transparent way, is a viable solution. Like the case of professional language editors, part-time or full-time professional reviewers, managed by universities or for-profit companies, can be an integral part of modern peer review. Being a professional reviewer could be financially attractive to retired senior researchers and to researchers who enjoy evaluating papers but are not motivated to do so for free. Moreover, not all produced research needs to go through peer review, and thus persuading researchers to limit submissions to their most novel and useful research could also help bring submission volumes to manageable levels. Overall, this paper reckons that the problem is not the peer review process per se but rather its function within an academic ecosystem dominated by an unhealthy culture of ‘publish or perish’. Instead of reforming the peer review process, academia has to look for better science dissemination schemes that promote collaboration over competition, engagement over judgement, and research quality and sustainability over quantity.

CL: I love this paper, and kudos to the author for writing it and revising it appropriately (and openly, thanks to F1000’s fantastic platform). In my opinion, publishers are going to have to embrace paying reviewers at some point. Yes, it’s not straightforward, but when one journal in a field does it, others will have to follow. And it’s not meant to be a reflection of the time spent on the article multiplied by your hourly rate - more a nod to the fact that your labour is adding value to the publisher’s product. A vision of the future, perhaps.
https://doi.org/10.12688/f1000research.148985.2

// 4
Peer review is essential for science. Unfortunately, it’s broken.
Ars Technica - 12 July 2024 - 12 min read

I was blown away by this evisceration of peer review by cosmologist Paul Sutter. He really knows his stuff and is clearly despairing of many current aspects of the peer review process. And it’s hard to disagree when the facts are presented in such a compelling narrative. Makes me want to read his book, which I may treat myself to.
https://arstechnica.com/science/2024/07/peer-review-is-essential-for-science-unfortunately-its-broken/

// 5
The Peer Review Process: Past, Present, and Future
British Journal of Biomedical Science - 17 June 2024 - 15 min read

What at first glance appeared to be another ‘history of peer review’ review actually turned into something much more interesting when it started examining current problems and future solutions, as well as the efficacy and reliability of generating review reports.
I’d like to think the future models of peer review they suggest (open, collaborative, professionalized, and AI-assisted) are not necessarily mutually exclusive:
https://doi.org/10.3389/bjbs.2024.12054

And finally…
Time to move on from football (ahem). Another report that caught my eye this week was on the use of generative AI in journalism. I was amazed in 2017 when a journalist told me most financial news was already written by automated systems in newsrooms. This is market research (N=45) into attitudes towards generative AI in news. Just read the thread if you don’t have time for the full report, then swap ‘peer review’ in for ‘journalism’ and the results seem to resonate:
https://x.com/risj_oxford/status/1813105582062076211

OK. Congratulations to Spain too, I guess. Best team won.

Let's do coffee!
I’m travelling to the following places over the next few weeks. Always happy to meet and discuss anything related to this newsletter. Just reply to this email and we can set something up:

Birmingham: 29th July
Leeds: 8th August
Oxford: 9th August
ALPSP (Manchester): 11-13 September

Curated by Chris Leonard.
If you want to get in touch with me, please simply reply to this email.