AI + Reproducibility + Peer Review = A Perfect Storm in Science.

Jan 29, 2024

By Robert Hanna

***

Contemporary science is a juggernaut, but also multiply troubled. In a series of essays, I’ve argued for three critical claims about mainstream 20th and 21st century formal and natural science.

First, although computing machinery is indeed artificial, it can’t ever be intelligent in the sense in which we’re intelligent, no matter how sophisticated it might be or become: therefore, the research program of so-called “Artificial Intelligence,” aka AI, is in fact nothing but a pernicious myth. Why? Computers can’t perform uncomputable functions, but we can (Hanna, 2023a). Computers are Turing machines, hence machines, hence they’re not living organisms. But all and only living organisms of a certain high level of complexity, i.e., animals, have consciousness: indeed consciousness is nothing more and nothing less than a form of life (Hanna and Maiese, 2009). Therefore computers can’t be conscious, whereas we’re conscious, self-conscious, and intelligent but also finite, fallible, and also otherwise thoroughly normatively imperfect animals, i.e., we’re rational “human, all-too-human” minded animals, and computers can never replicate this, not even in principle (Hanna, 2023b). Our intelligence is not only intellectual but also affective or caring-based, but computers cannot have affects or care about anything (Hanna, 2023c). Moreover, according to what I call Babbage’s Principle, computers cannot convert false or otherwise flawed informational inputs into true or in any other way saliently improved outputs, but we can (Hanna, 2023d). Correspondingly, computers are not capable of authentic human creativity, but we are (Hanna, 2023e). 
Finally, our rationally unjustified and false belief in the myth of AI enables our excessive reliance on and indeed addiction to digital technology (Hanna, 2023f); and, in the specific case of Large Language Models (LLMs), aka chatbots, aka so-called “generative AI,” and so-called “Artificial General Intelligence” (AGI), just as the research program to design and build atomic bombs should have been shut down and banned as soon as its destructive potential was revealed in the first atomic bomb test in 1945, so too should the research program in so-called generative AI and/or AGI be immediately shut down and banned (Hanna, 2023g).

Second, contrary to a widespread belief in mainstream 20th and 21st century science and philosophy of science, the reproducibility of empirical scientific studies is neither a necessary nor a sufficient condition of their truth. Why? According to what I call Hanna’s Uncertainty Principle, the more precisely we measure the original set-up conditions of empirical scientific studies, the less we’re able to reproduce their results, and conversely (Hanna, 2023h). Indeed, every empirical scientific study is unique. Therefore, the exact reproducibility of empirical scientific studies is impossible. And even if it were possible, reproducing exactly the same original set-up conditions and results of any empirical scientific study would be no more relevant to the truth of that study than generating multiple exact copies of some written text, like a newspaper, is relevant to the truth of claims made in that text (Hanna, 2023h).

And third, the professional academic system of peer review is wholly counterproductive. Why? In fact, peer review is nothing more and nothing less than a massively effective social-institutional mechanism for (i) straitjacketing scientific creativity into conformity with current scientific orthodoxy and (ii) bottlenecking the dissemination and sharing of the results of new scientific research (Hanna, 2023i).

That all being so, it would also be highly reasonable to expect that the confluence and conjunction of so-called AI, reproducibility, and peer review would generate a perfect storm in contemporary science. And that’s indeed the case, as Philip Ball argues in “Is AI Leading to a Reproducibility Crisis in Science?”:

Computer scientists Sayash Kapoor and Arvind Narayanan at Princeton University in New Jersey reported earlier this year that the problem of data leakage (when there is insufficient separation between the data used to train an AI system and those used to test it) has caused reproducibility issues in 17 fields that they examined, affecting hundreds of papers…. They argue that naive use of AI is leading to a reproducibility crisis.

Machine learning (ML) and other types of AI are powerful statistical tools that have advanced almost every area of science by picking out patterns in data that are often invisible to human researchers. At the same time, some researchers worry that ill-informed use of AI software is driving a deluge of papers with claims that cannot be replicated, or that are wrong or useless in practical terms.

There has been no systematic estimate of the extent of the problem, but researchers say that, anecdotally, error-strewn AI papers are everywhere. “This is a widespread issue impacting many communities beginning to adopt machine-learning methods,” Kapoor says.

Aeronautical engineer Lorena Barba at George Washington University in Washington DC agrees that few, if any, fields are exempt from the issue. “I’m confident stating that scientific machine learning in the physical sciences is presenting widespread problems,” she says. “And this is not about lots of poor-quality or low-impact papers,” she adds. “I have read many articles in prestigious journals and conferences that compare with weak baselines, exaggerate claims, fail to report full computational costs, completely ignore limitations of the work, or otherwise fail to provide sufficient information, data or code to reproduce the results.”…

As with any powerful new statistical technique, AI systems can make it easy for researchers looking for a particular result to fool themselves. “AI provides a tool that allows researchers to ‘play’ with the data and parameters until the results are aligned with the expectations,” says Shamir.

“The incredible flexibility and tunability of AI, and the lack of rigour in developing these models, provide way too much latitude,” says computer scientist Benjamin Haibe-Kains at the University of Toronto, Canada, whose lab applies computational methods to cancer research….

Reproducibility doesn’t guarantee that the model is giving correct results, but only self-consistent ones, warns computer scientist Joaquin Vanschoren at the Eindhoven University of Technology in the Netherlands. He also points out that “a lot of the really high-impact AI models are created by big companies, who seldom make their codes available, at least immediately.” And, he says, sometimes people are reluctant to release their own code because they don’t think it is ready for public scrutiny.

Although some computer-science conferences require that code be made available to have a peer-reviewed proceedings paper published, this is not yet universal. “The most important conferences are more serious about it, but it’s a mixed bag,” says Vanschoren.

Part of the problem could be that there simply are not enough data available to properly test the models. “If there aren’t enough public data sets, then researchers can’t evaluate their models correctly and end up publishing low-quality results that show great performance,” says Joseph Cohen, a scientist at Amazon AWS Health AI, who also directs the US-based non-profit Institute for Reproducible Research. “This issue is very bad in medical research.”

The pitfalls might be all the more hazardous for generative AI systems such as large language models (LLMs), which can create new data, including text and images, using models derived from their training data. Researchers can use such algorithms to enhance the resolution of images, for instance. But unless they take great care, they could end up introducing artefacts, says Viren Jain, a research scientist at Google in Mountain View, California, who works on developing AI for visualizing and manipulating large data sets….

[Data scientist Gaël] Varoquaux and computer scientist Veronika Cheplygina at the IT University of Copenhagen have argued that current publishing incentives, especially the pressure to generate attention-grabbing headlines, act against the reliability of AI-based findings…. Haibe-Kains adds that authors do not always “play the game in good faith” by complying with data-transparency guidelines, and that journal editors often don’t push back enough against this.

The problem is not so much that editors waive rules about transparency, Haibe-Kains argues, but that editors and reviewers might be “poorly educated on the real versus fictitious obstacles for sharing data, code and so on, so they tend to be content with very shallow, unreasonable justifications [for not sharing such information]”. Indeed, authors might simply not understand what is required of them to ensure the reliability and reproducibility of their work. “It’s hard to be completely transparent if you don’t fully understand what you are doing,” says [Casey] Bennett [at DePaul University in Chicago, Illinois, a specialist in the use of computer methods in health].

In a Nature survey this year that asked more than 1,600 researchers about AI, views on the adequacy of peer review for AI-related journal articles were split. Among the scientists who used AI for their work, one-quarter thought reviews were adequate, one-quarter felt they were not and around half said they didn’t know. (Ball, 2023; see also Van Noorden and Perkel, 2023, esp. under the rubric “Quality of AI Review in Research Papers”)
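The “data leakage” mentioned at the start of the passage quoted above (insufficient separation between training data and test data) can be made concrete with a small sketch. What follows is only an illustration under stated assumptions, not anything taken from Ball’s article or from Kapoor and Narayanan’s study: the toy data set, the lookup-table “model,” and the names `train_memorizer` and `accuracy` are all hypothetical. The model merely memorizes its training items, so it looks perfect when tested on items it has already seen and no better than guessing when tested on items it has not.

```python
def train_memorizer(examples):
    """Return a lookup-table 'model' that only memorizes (feature, label) pairs."""
    return dict(examples)


def accuracy(model, examples, default=0):
    """Fraction of examples whose label the model gets right.

    Items the model has never seen receive the fixed `default` label.
    """
    correct = sum(1 for x, y in examples if model.get(x, default) == y)
    return correct / len(examples)


# Hypothetical toy data set: 200 items whose labels alternate 0, 1, 0, 1, ...
data = [(i, i % 2) for i in range(200)]
train, held_out = data[:100], data[100:]

model = train_memorizer(train)

leaky_test = train[:50]     # leakage: these items were also used for training
clean_test = held_out[:50]  # proper separation: never seen during training

leaky_acc = accuracy(model, leaky_test)
clean_acc = accuracy(model, clean_test)

print(f"accuracy with leakage:    {leaky_acc:.2f}")
print(f"accuracy without leakage: {clean_acc:.2f}")
```

On this toy set-up the leaky evaluation reports perfect accuracy while the properly separated one reports chance-level accuracy, which is the kind of inflated result that cannot later be reproduced on fresh data.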

And there’s another serious problem about so-called AI in science: it’s being used to commit scientific fraud by means of digitally manipulating data images:

The Dana-Farber Cancer Institute, an affiliate of Harvard Medical School, is seeking to retract six scientific studies and correct 31 others that were published by the institute’s top researchers, including its CEO. The researchers are accused of manipulating data images with simple methods, primarily with copy-and-paste in image editing software, such as Adobe Photoshop….

The very simple methods used to manipulate the DFCI data are remarkably common among falsified scientific studies …. Data sleuths have gotten better and better at spotting such lazy manipulations, including copied-and-pasted duplicates that are sometimes rotated and adjusted for size, brightness, and contrast. As Ars recently reported, all journals from the publisher Science now use an AI-powered tool to spot just this kind of image recycling because it is so common. (Mole, 2024)

As the second paragraph quoted just above shows, not only is so-called AI a means of scientific fraud, but also it’s now being employed as the cure of the scientific disease for which it is itself the cause: a vicious circle if ever there was one.

Moreover, beyond so-called AI, reproducibility, and peer review per se, we should also add the important fact that these are embedded in and enabled by the three endemic, serious problems of the commodification, mechanization, and moralization of higher education inside the contemporary professional academy (Hanna, 2024a). By virtue of those problems, contemporary science is not only intimately entangled with the military-industrial complex,[i] and indeed with what I’ve called the military-industrial-digital complex (Hanna, 2023j), but also with what I’ll call the military-industrial-digital-university complex (see also Schmidt, 2000). For convenience, let’s call this MIDUC.[ii]

Now, what is to be done? Here are four proposals.

First, abolish the unconstrained so-called AI research program and replace it with what I’ve called dignitarian neo-luddism with respect to digital technology, which says that

not all digital technology is bad and wrong, but instead all and only the digital technology that harms and oppresses ordinary people (i.e., people other than digital technocrats), by either failing to respect our human dignity sufficiently or by outright violating our human dignity, is bad and wrong, and therefore all and only this bad and wrong digital technology should be rejected but not — except in extreme cases of digital technology whose coercive use is actually violently harming and oppressing ordinary people, for example, digitally-driven weapons or weapons-systems being used for mass destruction or mass murder — destroyed, rather only either simply refused, non-violently dismantled, or radically transformed into its moral opposite. (Hanna, 2023j: pp. 7–8)

Second, abolish the reproducibility requirement and replace it with the two-part requirement of (i) the attitude of epistemic humility towards empirical science, which

is not an all-out or destructive skepticism about empirical science, but instead a measured or constructive skepticism that yields a critical awareness of the proper limits and scope of empirical science (Smith and Smith, 2023),

and (ii) what I’ll call the family resemblance requirement, which says that

as a necessary condition of its truth, any empirical scientific study must be fully consistent and coherent with an appropriately large family of overlapping, individually unique and therefore different, but also in all theoretically relevant ways similar, empirical scientific studies, on the assumption that at least some of the studies in that family are true.

Third, abolish peer review and replace it with what I’ve called the matrix of ideas:

in diametric opposition to the mechanical, constrictive “marketplace of ideas” thought-shaper, I’m proposing instead a thought-shaper I’ll call the matrix of ideas, which captures not only (i) the structured, systematic conception of a grid, but also (ii) the organic, generative conception of a womb. Above all, in the [post-peer-review] environment I’m characterizing as “the matrix of ideas,” where serious scholars are collaborators collectively pursuing goodness, truth, and knowledge, and not competitors individually pursuing professional academic zero-sum bragging-rights and glory, high social status, high salaries, and coercive moralistic power over their so-called “colleagues,” there would be no commodification, mechanization, or moralization, all of which are endemic, significant problems for contemporary higher education inside the professional academy. (Hanna, 2023i: p. 7)

Fourth and finally, detach science from MIDUC to the greatest extent that’s social-institutionally possible.

In my opinion, implementing these four proposals would not only directly and effectively address and reverse the perfect storm in contemporary science that’s being caused by the confluence and conjunction of so-called AI, reproducibility, and peer review, in their intimate entanglement with MIDUC, but also bring mainstream 21st century science smoothly into conformity with a radically alternative approach to the formal-&-natural sciences that I’ve called promethean science (Hanna, 2024b: esp. ch. 15).[iii]

NOTES

[i] This of course riffs on a famous phrase in US President Dwight D. Eisenhower’s “Farewell Address” in 1961:

[The] conjunction of an immense military establishment and a large arms industry is new in the American experience. The total influence — economic, political, even spiritual — is felt in every city, every statehouse, every office of the federal government. We recognize the imperative need for this development. Yet we must not fail to comprehend its grave implications. Our toil, resources and livelihood are all involved; so is the very structure of our society. In the councils of government, we must guard against the acquisition of unwarranted influence, whether sought or unsought, by the military–industrial complex. The potential for the disastrous rise of misplaced power exists, and will persist. We must never let the weight of this combination endanger our liberties or democratic processes. We should take nothing for granted. Only an alert and knowledgeable citizenry can compel the proper meshing of the huge industrial and military machinery of defense with our peaceful methods and goals so that security and liberty may prosper together. (See, e.g., Wikipedia, 2024, [boldfacing] added)

[ii] Pronounced “my duck.”

[iii] I’m grateful to Scott Heftler for thought-provoking conversation on and around the main topics of this essay, to Donald Stanley for calling my attention to (Ball, 2023), and to Joseph Wayne Smith for drawing my attention to (Mole, 2024).

REFERENCES

(Ball, 2023). Ball, P. “Is AI Leading to a Reproducibility Crisis in Science?” Nature 624: 22–25. Available online at URL = <https://www.nature.com/articles/d41586-023-03817-6>.

(Hanna, 2023a). Hanna, R. “How and Why to Perform Uncomputable Functions.” Unpublished MS. Available online at URL = <https://www.academia.edu/87165326/How_and_Why_to_Perform_Uncomputable_Functions_March_2023_version_>.

(Hanna, 2023b). Hanna, R. “The Myth of Artificial Intelligence and Why It Persists.” Unpublished MS. Available online at URL = <https://www.academia.edu/101882789/The_Myth_of_Artificial_Intelligence_and_Why_It_Persists_May_2023_version_>.

(Hanna, 2023c). Hanna, R. “‘It’s a Human Thing. You Wouldn’t Understand.’ Computing Machinery and Affective Intelligence.” Unpublished MS. Available online HERE.

(Hanna, 2023d). Hanna, R. “Babbage-In, Babbage-Out: On Babbage’s Principle.” Unpublished MS. Available online at URL = <https://www.academia.edu/101462742/Babbage_In_Babbage_Out_On_Babbages_Principle_May_2023_version_>.

(Hanna, 2023e). Hanna, R. “Creative Rage Against the Computing Machine: Necessary and Sufficient Conditions for Authentic Human Creativity.” Unpublished MS. Available online HERE.

(Hanna, 2023f). Hanna, R. “Addicted to Chatbots: ChatGPT as Substance D.” Unpublished MS. Available online at URL = <https://www.academia.edu/103582236/Addicted_to_Chatbots_ChatGPT_as_Substance_D_June_2023_version_>.

(Hanna, 2023g). Hanna, R. “Oppenheimer, Kaczynski, Shelley, Hinton, & Me: Don’t Pause Giant AI Experiments, Ban Them.” Unpublished MS. Available online HERE.

(Hanna, 2023h). Hanna, R. “Empirical Science With Uncertainty But Without Reproducibility.” Unpublished MS. Available online HERE.

(Hanna, 2023i). Hanna, R. “The End of Peer Review and The Matrix of Ideas.” Unpublished MS. Available online at URL = <https://www.academia.edu/109130045/The_End_of_Peer_Review_and_The_Matrix_of_Ideas_November_2023_version_>.

(Hanna, 2024a). Hanna, R. “Higher Education Without Commodification, Mechanization, or Moralization.” Available online HERE.

(Hanna, 2024b). Hanna, R. PROMETHEAN SCIENCE: Mind, Life, The Formal-&-Natural Sciences, and A New Concept of Nature. Self-Published. Available online HERE.

(Hanna and Maiese, 2009). Hanna, R. and Maiese, M., Embodied Minds in Action. Oxford: Oxford Univ. Press. Available online in preview at URL = <https://www.academia.edu/21620839/Embodied_Minds_in_Action>.

(Mole, 2024). Mole, B. “Top Harvard Cancer Researchers Accused of Scientific Fraud; 37 Studies Affected.” Ars Technica. 22 January. Available online at URL = <https://arstechnica.com/science/2024/01/top-harvard-cancer-researchers-accused-of-scientific-fraud-37-studies-affected/>.

(Schmidt, 2000). Schmidt, J. Disciplined Minds: A Critical Look at Salaried Professionals and the Soul-Battering System That Shapes Their Lives. New York: Rowman & Littlefield.

(Smith and Smith, 2023). Smith, J.W. and Smith, S.J. “From Scientific Reproducibility To Epistemic Humility.” Against Professional Philosophy. 3 December. Available online at URL = <https://againstprofphil.org/2023/12/03/from-scientific-reproducibility-to-epistemic-humility/>.

(Van Noorden and Perkel, 2023). Van Noorden, R. and Perkel, J.M. “AI and Science: What 1,600 Researchers Think.” Nature 621: 672–675. Available online at URL = <https://www.nature.com/articles/d41586-023-02980-0>.

(Wikipedia, 2024). Wikipedia. “Military-Industrial Complex.” Available online at URL = <https://en.wikipedia.org/wiki/Military%E2%80%93industrial_complex>.
