
Scalene 19: Submission fees / funding reviews / eliminating bias

Humans | AI | Peer review. The triangle is changing.

A real mixed bag of stories this week, with a subtle focus on eliminating bias in AI evaluations. There is also an additional links section: there were too many stories to cover in full this week, but some were too interesting to ignore.

3rd November 2024

// 1
Peer Review is Broken
Substack - 02 Nov 2024 - 9 min read

Sorry to start off on a downer like this, but this is an excellent fresh pair of eyes on the current state of peer review, and a suggestion to fix it: submission fees which go to pay reviewers. Given that over-publishing is identified as one of the problems Joel aims to fix, the ‘Polluter Pays Principle’ is seen as having some utility in academic publishing too. Here, the quality of the submission is factored into the submission fee, meaning the worst polluters pay a heavier fine/fee:

An IIT Delhi submission scoring 6/10: The US base fee of 1000 USD is normalized to India, resulting in an adjusted base fee of 128 USD. Then it is halved due to scoring one point above the 5/10 baseline, resulting in a final fee of 64 USD.
A Stanford submission scoring 3/10: The US base fee of 1000 USD stays constant. Then it is doubled twice for scoring two points below the 5/10 baseline, resulting in a final fee of 4000 USD.
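
The scaling implied by those two examples works out to a simple formula: the country-normalized base fee is doubled for every point below the 5/10 baseline and halved for every point above it. Here's a minimal sketch in Python - the country factors are hypothetical values back-calculated from the article's examples, not anything Joel publishes:

```python
# Sketch of the fee formula implied by the two worked examples above.
# The country factors are illustrative; India's is back-calculated
# from the article's example (1000 USD base -> 128 USD adjusted).
US_BASE_FEE = 1000  # USD

COUNTRY_FACTOR = {"US": 1.0, "India": 0.128}

def submission_fee(country, score, baseline=5):
    """Fee doubles for each point below the baseline quality score
    and halves for each point above it."""
    adjusted_base = US_BASE_FEE * COUNTRY_FACTOR[country]
    return adjusted_base * 2 ** (baseline - score)

print(submission_fee("India", 6))  # 64.0   (the IIT Delhi example)
print(submission_fee("US", 3))     # 4000.0 (the Stanford example)
```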

CL: I’m intrigued by the concept of not only adapting the base fee to the author’s country, but also the exponential doubling of the fee for poor submissions. It’s scary to submit anything in this scenario unless you’re 100% sure it’s as good as it can be. There are numerous

// 2
AI-assisted prescreening of biomedical research proposals: ethical considerations and the pilot case of “la Caixa” Foundation
Data & Policy, CUP - 23 Oct 2024 - 42 min read

It’s not only journals that employ peer review to ascertain the validity and quality of submissions; funders do it too - often on a larger scale and to tighter deadlines (which is to say, deadlines). It’s therefore fairly surprising to me that this is one of the few studies describing the use of AI to assist human reviewers of grant applications.

It’s a long but worthwhile read, so I made a funky podcast out of it with NotebookLM, and you can listen while you’re doing something else. I thought the work they did on mitigating biases was particularly insightful.

// 3
Written by AI, reviewed by AI, and published by AI - the human editor as the ultimate gatekeeper in publication ethics
European Science Editing - 02 Sept 2024 - 5 min read

This is a short but insightful piece on the relative merits of Gemini and ChatGPT as reviewers of manuscripts.

Both Gemini and ChatGPT provided positive, albeit somewhat generic, remarks on the paper’s strengths, with significant overlap between the 2 sets of comments. Interestingly, however, the chatbots had different opinions when it came to considerations for improvement. While Gemini’s review criticized the limited discussions of AI benefits, specificity in recommendations and future research considerations, ChatGPT asked for further empirical data, broader stakeholder perspectives and “to compare the proposed AI research integrity guidelines with existing frameworks or guidelines in related fields.” In conclusion, Gemini noted that “This paper provides a valuable analysis of the challenges and opportunities presented by AI in scientific research,” while ChatGPT asserted that “Overall, the paper makes a significant contribution to the literature by addressing a timely and important topic with clarity and depth.” On the whole, Gemini and ChatGPT’s comments appear to be 2 well-structured but separate positive reviews. Notably, however, these chatbots tend to spell out broad notions of strengths and weaknesses of the paper in a rather general and templated manner, but do not really go into details of any of the writings and initiate criticisms from within the lines.

CL: I’ve highlighted this passage because it captures a common reaction to using LLMs in the review process. Right now you need granular prompting, iteration on answers, and human interpretation to make this a sensible process - a rough sketch of what that might look like follows below. Also, domain experts need to link to previous literature in a more comprehensive way than plain LLMs allow.
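
To make that concrete, here's a hypothetical sketch of granular, iterative prompting - `ask_llm` stands in for whatever chat-completion API you use, and the section list is purely illustrative:

```python
# Hypothetical sketch of granular, iterative LLM review: one focused
# prompt per section, then a follow-up pass that pushes for specifics.
# ask_llm() is a stand-in for any chat-completion API call.

SECTIONS = ["abstract", "methods", "results", "discussion"]

def review_manuscript(manuscript_text, ask_llm):
    comments = {}
    for section in SECTIONS:
        # First pass: a narrow question rather than "review this paper".
        draft = ask_llm(
            f"Review only the {section} of the manuscript below. "
            f"Quote the specific sentences you are criticising.\n\n"
            f"{manuscript_text}"
        )
        # Second pass: iterate on the answer to strip generic praise.
        comments[section] = ask_llm(
            f"Your review was: {draft}\n"
            "Which of these points are generic remarks that would apply "
            "to any paper? Remove them and keep only specific, "
            "line-level criticisms."
        )
    return comments  # a human still has to interpret and verify these
```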

// 4
A Third Transformation?
Ithaka S+R - 30 Oct 2024 - 36 min read

In January 2024, Ithaka S+R published “The Second Digital Transformation of Scholarly Publishing” - which, shamefully, I only got round to reading a few weeks ago. Now there is an update, based on 12 expert interviews held in mid-2024. There is specific mention of disruption to the peer review workflow, but the scope of this report is much broader:

The consensus among the individuals with whom we spoke is that generative AI will enable efficiency gains across the publication process. Writing, reviewing, editing, and discovery will all become easier and faster. Both scholarly publishing and scientific discovery in turn will likely accelerate as a result of AI-enhanced research methods. From that shared premise, two distinct categories of change emerged from our interviews. In the first and most commonly described future, the efficiency gains made publishing function better but did not fundamentally alter its dynamics or purpose. In the second, much hazier scenario, generative AI created a transformative wave that could dwarf the impacts of either the first or second digital transformations.

// 5
Conformity in Large Language Models
arXiv - 16 Oct 2024 - 24 min read

If you want to know whether LLMs display the same psychological biases as humans, the bad news is that, yes, they do. This is as much a problem for AI-assisted review as it is for human reviewing. The one consolation is that we can mitigate the excesses of bias with smart prompting, using Devil’s Advocate and Question Distillation approaches. The paper describes how ‘conforming to the mean’ was tuned out of LLMs with these interventions.
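
As an illustration only (my paraphrase of the idea, not the paper's exact prompts), a Devil's Advocate intervention can be as simple as a wrapper that forces the model to argue against the majority answer before committing - again using a hypothetical `ask_llm`:

```python
# Illustrative Devil's Advocate wrapper against conformity bias:
# before accepting an answer that matches the majority view, the
# model is asked to argue the opposite case and then re-decide.
# ask_llm() is a stand-in for any chat-completion API call.

def devils_advocate(question, majority_answer, ask_llm):
    counterargument = ask_llm(
        f"Question: {question}\n"
        f"Most respondents answered: {majority_answer}\n"
        "Play devil's advocate: give the strongest case for a "
        "different answer."
    )
    return ask_llm(
        f"Question: {question}\n"
        f"Majority answer: {majority_answer}\n"
        f"Counterargument: {counterargument}\n"
        "Weighing both sides, answer independently of the majority."
    )
```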

And finally…

So much extra stuff to share this week, but before I do: I’ve been engrossed in reading the Apollo 11 Preliminary Science Report - and now you can too: https://www.nasa.gov/wp-content/uploads/static/history/alsj/a11/as11psr.pdf

One of the reasons this newsletter takes so long to write is that I get sidetracked too easily. I wanted to find a link to the alternative speech that Richard Nixon would have given had the two moon-landing astronauts been unable to come home:
https://www.archives.gov/files/presidential-libraries/events/centennials/nixon/images/exhibit/rn100-6-1-2.pdf

However, I got sidetracked by a deepfake of Nixon actually delivering it, made by MIT. Very disconcerting: https://moondisaster.org

Anyway, here are those links I promised:

Let's do coffee!
I’m in London and free on the morning of November 6th for anyone who wants to meet up.

Curated by Chris Leonard.
If you want to get in touch with me, please simply reply to this email.