ScreenIT: harnessing automated peer-review to support Open Science

[English version of the article « ScreenIT : quand l’automatisation du peer review se met au service de la science ouverte »]

Have you heard of ScreenIT, formerly known as the Automated Screening Working Group? Since 2020, this collective of biologists and software engineers has been developing tools to improve the rigour and reproducibility of scientific research. Their project originated in response to the avalanche of bioRxiv and medRxiv preprints concerning the COVID-19 pandemic. While these papers were a valuable resource for the medical and scientific communities, their volume and speed of production far exceeded the capacity of peer review, resulting in a significant delay between the publication of a preprint and its professional validation. In response, ScreenIT developed a pipeline to systematically analyse preprints and flag common problems, sharing the resulting reports on hypothes.is. Among ScreenIT’s key criteria? The presence of transparent, open code and data.

Fast-forward to 2026. Thankfully, the pandemic is over, but the problem of the sheer volume of scientific papers overwhelming the peer review system has only worsened with the explosion of AI tools. For example, the Wiley group reported a 25% increase in paper submissions in the first quarter of 2025 alone. Last October, a Cambridge University Press study revealed that 81% of the 3000 surveyed researchers agreed that ‘the increase in scientific papers has put the peer review system under pressure’. Meanwhile, as many as 10% of cancer-research publications from 1999 to 2024 have recently been estimated to be paper-mill frauds! To identify fraudulent articles, a tool newly published in BMJ automatically detects characteristic paper-mill patterns (incorrect descriptions of reagents, manipulated or re-used images, plagiarised text, templated layouts, etc.). But to analyse the credibility and reproducibility of legitimate scientific articles, a tool assessing the transparency of the work and data would be more relevant.

In February 2026, ScreenIT members published a study of their current pipeline, in which they compared 11 tools across nine transparency criteria by applying them to 1500 open-access papers on PubMed Central. To establish a gold standard, human evaluators manually curated 100 papers for each criterion. The authors sought to determine not only the best tool for each task, but also whether combining tools would improve performance. The results were mixed: for certain tasks, such as evaluating the transparency of participant inclusion or exclusion criteria in clinical studies, the combination of multiple complementary tools produced the best results. For others, such as the identification of software used, SoftCite was the sole winner. Personally, I was most interested in the transparency criterion concerning code accessibility. Contrary to expectations, ODDPub was the most effective tool for this task, outperforming the commercial AI alternative SciScore.
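To make the comparison logic concrete, here is a minimal sketch of how a screening tool's yes/no flags might be scored against a human-curated gold standard, and how two tools might be combined. It is an illustration only, not the study's actual method: the choice of precision, recall and F1 as metrics, the "flag if any tool flags" combination rule, the function names, and the six-paper toy data are all assumptions made for this example.

```python
from typing import Dict, List


def score(predicted: List[bool], gold: List[bool]) -> Dict[str, float]:
    """Compare one tool's flags with the gold standard, paper by paper."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must cover the same papers")
    pairs = list(zip(predicted, gold))
    tp = sum(1 for p, g in pairs if p and g)        # tool and humans both say yes
    fp = sum(1 for p, g in pairs if p and not g)    # tool says yes, humans say no
    fn = sum(1 for p, g in pairs if not p and g)    # tool misses a human yes
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


def combine_any(*tools: List[bool]) -> List[bool]:
    """Hypothetical combination rule: flag a paper if at least one tool flags it."""
    return [any(flags) for flags in zip(*tools)]


# Made-up toy data for six papers (NOT results from the study).
# True means "the criterion is met", e.g. the paper shares its code.
gold = [True, True, True, False, False, True]
tool_a = [True, False, True, False, False, False]
tool_b = [False, True, True, True, False, False]

for name, flags in [("tool_a", tool_a), ("tool_b", tool_b),
                    ("combined", combine_any(tool_a, tool_b))]:
    print(name, score(flags, gold))
```

In this toy case the combination recovers more of the gold-standard positives than either tool alone, at the cost of inheriting one tool's false positive. That trade-off is one plausible reason why combining tools could help on some criteria and not on others.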

The influence of the original pandemic context remains evident throughout ScreenIT’s tools; many are geared towards the specific characteristics of clinical trials, limiting their applicability to other types of scientific articles. Nevertheless, ScreenIT also provides general best-practice analyses, such as the use of randomisation, single- or double-blinding methods, or study power assessment based on sample size. Yet despite the value of this tool collection, I was unable to find access to the overall pipeline. For personal use, it seems we need to search for each tool individually.

And at the level of communities or companies, will these tools be used? For one journal, the answer is already ‘yes’: Jessica Leight, researcher at the International Food Policy Research Institute and academic editor at PLOS One, recently confirmed the journal’s adoption of ScreenIT in a LinkedIn post. These ScreenIT reports are already available in some of the open-access peer-review reports for PLOS One, marked with a characteristic robot logo (see, for example, the report associated with this article). How this addition may assist authors and editors in the review and publication process remains to be seen.

Taking a step back, what might the future hold for ScreenIT? Beyond PLOS One, I believe ScreenIT could perfectly fulfil its original purpose: quickly sorting through preprints and identifying those with sufficiently solid and transparent methods to merit consideration, even before peer review. In this way, I could easily see ScreenIT being integrated into the various Rxiv platforms, enabling rapid verification of a set of transparency criteria. I do not think, however, that ScreenIT is a sufficient substitute for human reviewers. More broadly, why not use ScreenIT to assess the transparency of a scientist’s work? Perhaps it could be the first step towards incorporating reproducibility into decisions regarding hiring, promotion, or funding.

Caitlin Martin, postdoctoral researcher at the Institut Pasteur

References:

Weissgerber et al. 2021. Automated screening of COVID-19 preprints: can we help authors to improve transparency and reproducibility? Nature Medicine. 27:6-7. https://doi.org/10.1038/s41591-020-01203-7
Scancar et al. 2026. Machine learning based screening of potential paper mill publications in cancer research: methodological and cross sectional study. BMJ. 392. https://doi.org/10.1136/bmj-2025-087581
Eckmann et al. 2026. Use as directed? A comparison of software tools intended to check rigor and transparency of published work. PLOS One. https://doi.org/10.1371/journal.pone.0342225
