
DOI: https://doi.org/10.36850/dc83cd30-11ad

Cultures of Trial and Error: Identifying and Overcoming Barriers in Science Correction

By Melpomeni (Melina) Antonakaki, Candida F. Sánchez Burmester & Mady Barbeitas

A JOTE x NanoBubbles peer-reviewed blog post series, with guest editors Melpomeni (Melina) Antonakaki, Candida F. Sánchez Burmester, and Mady Barbeitas.

The sentiment that "the truth will prevail" is commonly held in the worlds of science, suggesting that scientific knowledge advances through self-correction, via peer review and replication. However, the arguments and evidence supporting this idea have been continuously debated and updated. This discussion spans academic, industry, and science policy communities, focusing on various normative issues, such as whether corrective measures in science are necessary, what form they should take, whether they should operate on an individual or systemic level, and how to demarcate the "self" in self-correction.

The last decade has witnessed a renewed, practical interest in science correction, often alongside concerns about crises of reproducibility in biomedicine (McNutt, 2014), nanotechnology and materials science (Leong et al., 2019), and crises of confidence regarding the experimental methods of social psychology (Pashler & Wagenmakers, 2012) and political science (Janz & Freese, 2019). Furthermore, this has occurred within the context of a larger crisis of higher education (Callard, 2022; Loher & Strasser, 2019), as much of university-based research today depends on short-term contracts, while competitive rankings, research output indicators and audit procedures have been introduced (and resisted) as drivers of academic productivity (Halffman & Radder, 2015; Welsh, 2021).

Today, science correction—whether related to one or a combination of the aforementioned crisis narratives and productivity drives—is debated and practiced across a wide range of fields and forums. Several efforts to correct science exist outside or at the margins of academia. At the same time, specific projects and personalities have received widespread recognition, while events such as the World Conferences on Research Integrity have become powerful institutions in themselves. We refer to these heterogeneous communities of varying degrees of institutionalization, but with explicit commitments to science correction, as 'cultures of trial and error': their participants develop shared standards of practice and communication so that they can carry out together the corrective work they set out to do (this is what goes into designing any Trial); at the same time, they cannot evaluate the wider significance of corrective work alone and need other cultures of trial and error, as well as broader constituencies, to see value in it (this is what goes into identifying any Error).

Several practitioners of cultures of trial and error have reported barriers in the execution and communication of their work, while trying to set the record straight via publication outlets (Gelman, 2013; Lévy, 2022) or challenging questionable practices in science management (see discussions on policy ineffectiveness, harassment and retaliation at Franzen, 2021, p. 137, pp. 173-174; Täuber, 2022). These include rejections or delays in publishing in mainstream journals (as corrective output may not fit existing genres or is considered unsuitable for peer review), dismissive behavior (perceived as fostering mistrust or discord), legal threats (accusations of defamation), and a lack of support from their home institutions (as such work is not typically covered in contracts). Speaking publicly about the challenges of corrective work is often treated as a threat to the authority of scientific expertise at large, especially when systemic deficits are foregrounded in public discourse, such as the rise in the rate of article retractions (Abritis et al., 2021).

As interdisciplinary researchers and collaborative partners in scientific correction projects, we have observed how barriers tend to intersect, producing an amplified effect for those involved. Who benefits and who suffers in this intersectional predicament? Which lessons can be drawn from the experiences of collectively identifying barriers and developing strategies for overcoming them? And how do those lessons enrich the scope and shape of science correction?

As access to formal credentials or supporting infrastructure is often limited at the outset of corrective work, it is not surprising that sharing lived experiences of overcoming barriers and establishing credibility has become a valuable community resource. These stories contribute to the strong belief, at least among cultures of trial and error, that efforts to correct the errors and right the wrongs are both just and justified. New agents driving research culture reforms have emerged as a result, including communities of ‘meta-scientists,’ ‘sleuths’ (sometimes called ‘epistemic activists’ or ‘misconduct detectives’), as well as open and/or citizen science initiatives, ‘feminist technoscience laboratories’ and academic labor activists [1]. Through their persistent and creative efforts, particularly over the last fifteen years, emerging cultures of trial and error have reinvigorated longstanding questions about the role of proper scientific conduct in research outcomes.

The upcoming series of twelve blog posts, one published every month (and indicated throughout the text in bold), addresses how practitioners in specific cultures of trial and error experience and deal with barriers to science correction in academia and industry, contributing to an increased understanding of how corrections aim not just at error-free science, but at better science.

Overview of the series

Efforts to correct science may stem from scientist-driven initiatives and be long-term projects, as the four decades of continuous work of the Society for Scientific Values (hereafter, SSV) in India illustrate. Concerned about a case of unfounded claims and dubious promotion practices in the country, a group of Indian scientists came together in 1981 to promote a “healthy scientific environment” (Jayaraman, 1987, p. 535), founding the SSV in 1986. The SSV has examined cases of misconduct in India. By documenting the investigations on its website, and publishing a biannual newsletter discussing cases and articles, the SSV has curated a rich and open repository. In discussions about science correction, India and other countries in the Global South are often criticized for predatory publishing, while less attention is paid to corrective initiatives in these regions. A more balanced understanding of questionable and corrective practices in those countries requires serious consideration of cultures of trial and error and their national and international networks (Shahare & Roberts, 2020). Melina Antonakaki focuses on a Japanese scientist-driven initiative that aimed for self-correction in molecular biology, discussing what drove the researchers active in it for almost a decade (2006-2014), and which factors brought it to an untimely halt.

In other situations, science correction is framed as a problem of better integrating science in society, something that involves scientists and non-scientists. This framing opens up space for politicians, lawyers, government officials, and journalists to shape, to some extent, the standards and available resources for corrective activities. The US Congress hearings held in 1981, following several cases of misconduct, are a milestone in this sense. David Guston sees those hearings as “differ[ent] from the previous inquiries into the political loyalty of individual scientists and the fiscal integrity of the scientific agencies, scrutinizing instead the integrity of the performance of research and the mechanisms of alleged self-regulation of misconduct” (2000, p. 88). Today, established priorities, i.e., integrity and productivity, persist and interact, in public discourse and in scientific practice, with other evaluative ones, i.e., responsibility, reproducibility, openness, diversity, impact, etc. This is not frictionless. Bart Penders examines contemporary languages of research assessment and reform, particularly ones that posit the sciences as in decline and in pressing need of change, and finds that critics use the trope of nostalgia strategically, namely, for overcoming the dichotomy of tradition and innovation.

Evidence targeting systemic problems impeding corrections contributes today to intense debates about the relation between scientific work (for the benefit of societies) and technological innovation (in global capitalist markets). With the rise of metrics, new forms emerge for “gaming” systems of rewards (Biagioli & Lippman, 2020). Mario Biagioli elaborates on why it is so challenging to contest and correct what he calls ‘new species’ of misconduct. Another systemic issue is the way in which the pharmaceutical industry has leveraged and biased the corpus of medical science in service of company goals (Sismondo, 2020, 2021). Sergio Sismondo further develops the point, focusing on the difficulties of identifying and dismantling ‘epistemic corruption’. Yet, financial markets and commercial targets may at times indicate directions for science correction. For example, Declan Kuch and Nicolas Rasmussen trace a nano-medical discovery that attracted significant investment but failed in clinical trials, leading to the collapse of a major biotech company and dwindling interest from academic scientists.

In other cases, flawed discoveries remain uncorrected for a long time. Douglas Allchin (2015) narrates how sloppy or erroneous results can persist for decades without being identified, arguing that scientific self-correction is a myth. However, research communities often draw less dramatic conclusions as to why an error has persisted in their field. For instance, in the case of the miscalculated number of human chromosomes, Aryn Martin has shown that cytologists created 'common sense' explanations for why the error went undetected, which were “both technologically determinist... and asymmetrical” (2004, p. 924). Sahana Srinivasan delves further into error non-correction, analyzing how a refuted hypothesis persists in Alzheimer’s research.

Elsewhere, critics use infamous cases of research misconduct (e.g., Jan Hendrik Schön, Hwang Woo-suk, Diederik Stapel, Paolo Macchiarini, and Olivier Voinnet) to showcase the shortcomings of self-correction on a systemic level, since fraudulent activities in all those cases carried on for years, undetected by journals, colleagues and university bureaucracies alike. Practitioners in corrective projects suggest and implement various measures to address systemic flaws in knowledge validation. Anne Clinio dives deep into one of them, the ‘open science notebook’ of the late Jean-Claude Bradley, which originated from discussions about open knowledge and structural flaws in a system based on trust.

In recent years, practitioners have also tried to improve peer review, especially with the rise of digital platforms. Observing how peer review and the expectations surrounding it have evolved since the 17th century, Serge Horbach and Willem Halffman (2018) argue that these different expectations create tensions in contemporary debates. New online review practices, especially post-publication peer review, serve as "incubators" for developing cultures of trial and error. Practitioners often use blogs, social media, and platforms such as PubPeer to document their attempts to uncover misconduct, as well as to discuss barriers they encounter along the way. While this uncovers the systemic fragilities of peer review, Mady Barbeitas asks whether errors or hoaxes in the scientific record are only examples of blatant failure, or whether they could potentially serve as means to scout and tinker with the inner workings of peer review, scientific article corrections, and retractions.

One central topic in discussions around science correction is the ambivalent role of faithful replications. There is growing evidence that experimental replication has been a neglected practice: graduate students are often unprepared to cope with experimental irreplication or failure (Lubega et al., 2023) and, in many fields, research teams replicate each other’s results primarily in ‘integrative’ ways, adjusted to suit the interests of the replicating team (Peterson & Panofsky, 2021). For many, replications are deemed ‘unproductive’ for advancing scientific knowledge [2]. This assessment has only recently been challenged (Derksen & Morawski, 2024; Penders & Janssens, 2018), showing the generative potential of replications, for example in driving self-reflection in social psychology and opening discussions on subject matter variability (in this case, human behavior), as Maarten Derksen explains.

At the same time, we notice that reflections stemming from practical experience are rarely taken seriously as credible accounts worth analyzing, systematizing, and integrating in the scientific record as more-than-anecdotes. In fact, prominent voices in metascience promote a universalizing rhetoric that posits the disciplined experimental protocol and effectively managed bureaucracies as the main sources of corrective potential (Penders, 2022). Speaking in a broad and universal idiom often glosses over crucial differences among research fields and disregards infrastructural needs for science correction. Sheena Fee Bartscherer unpacks elements of the discourse of replication as a social movement, and discusses the wide range of stances, motivations and interests of up-and-coming reformers. In a complementary manner, Martin Bush, Cyrus Mody, Nicole Nelson, Maha Said, and Candida Sánchez Burmester discuss the requirements for doing replications across fields of research and note that these are very different.

To conclude, we believe that learning more about diversity in scientific knowledge cultures, here exemplified by the heterogeneous tales of science correction that the series covers, goes hand-in-hand with critical advocacy against an undifferentiated ‘replication drive’ (Holbrook et al., 2019), which would enforce the same standards of conduct and evaluation across the sciences and the humanities. By foregrounding the experiences and perspectives of cultures of trial and error, we believe this edited collection contributes to an appreciation of diversity in knowledge creation and correction across science and society.

Acknowledgements

The blog post series resulted from a round table at the 2024 EASST/4S conference and has involved most of the roundtable organizers and speakers as series contributors. We are very grateful to them for joining this series and for taking the time to write thought-provoking contributions. We would also like to thank Chakalaka Films for recording the round table and for transforming it into a podcast. Early versions of this editorial benefitted greatly from the friendly readings of Anne Clinio, Maarten Derksen, Sheena Fee Bartscherer and Nicole Nelson. Our thanks go out to the two peer reviewers as well, for their encouraging feedback and constructive criticism. Finally, we want to thank the editor of the Blog of Trial and Error, Marcel Hobma, for an excellent collaboration, as well as the whole JOTE team, especially Stefan Gaillard, for organizing peer review for all twelve posts.

This blog post series has been financially supported by 'NanoBubbles: how, when and why does science fail to correct itself', a project that has received Synergy grant funding from the European Research Council (ERC), within the European Union’s Horizon 2020 programme, grant agreement no. 951393. Mady Barbeitas and Candida F. Sánchez Burmester have individually received funding from the ERC-synergy project 'NanoBubbles', with the grant agreement no. 951393.

[1] Prominent examples of feminist technoscience spaces include the Civic Laboratory for Environmental Action Research (https://civiclaboratory.nl/) and the Access in the Making (AIM) Lab (https://accessinthemaking.ca/). Broadly speaking, feminist technoscience considers gender, along with other structures of power and oppression, to play a constitutive role in science and technology. The hybrid term ‘technoscience’ challenges the idea that there is a clear demarcation line between “pure” science and its downstream applications.

[2] This convention can be understood to some extent, if one were to look into the historical roots of the scientific publication regime. A prevalent focus of scientists, publishers and regulators onto ‘originality’ has facilitated the emergence and, ever since, relative stability of the modern function of the scientific author as owner of the reported work and as party responsible for the fruits of their labor (Biagioli, 2003, p. 256).

References

Allchin, D. (2015). Correcting the ‘self-correcting’ mythos of science. Filosofia E História Da Biologia, 10(1), 19-35.

Biagioli, M. (2003). Rights or Rewards? Changing Frameworks of Scientific Authorship. In M. Biagioli & P. Galison (Eds.), Scientific Authorship (pp. 253–280). Routledge.

Biagioli, M., & Lippman, A. (Eds.). (2020). Gaming the Metrics: Misconduct and manipulation in academic research. MIT Press.

Callard, F. (2022). Replication and Reproduction: Crises in Psychology and Academic Labour. Review of General Psychology, 26(2), 199–211.

Derksen, M., & Morawski, J. (2024). Replications are informative, particularly when they fail. Theory & Psychology, 34(5), 597–603.

Franzen, S. (2021). University Responsibility for the Adjudication of Research Misconduct: The Science Bubble. Springer.

Gelman, A. (2013). It’s Too Hard to Publish Criticisms and Obtain Data for Replication [Ethics and Statistics column]. Chance, 26(3), 49–52. http://www.stat.columbia.edu/~gelman/research/published/ChanceEthics8.pdf

Guston, D. H. (2000). Between Politics and Science: Assuring the Productivity and Integrity of Research. Cambridge University Press.

Halffman, W., & Radder, H. (2015). The Academic Manifesto: From an Occupied to a Public University. Minerva, 53(2), 165–187.

Holbrook, B. J., Penders, B., & de Rijcke, S. (2019, January 21). The humanities do not need a replication drive. CWTS. https://www.cwts.nl/blog?article=n-r2v2a4

Horbach, S. P. J. M., & Halffman, W. (2018). The changing forms and expectations of peer review. Research Integrity and Peer Review, 3, 8.

Janz, N., & Freese, J. (2019). Good and Bad Replications in Political Science: How Replicators and Original Authors (Should) Talk to Each Other. https://www.mzes.uni-mannheim.de/openscience/wp-content/uploads/2019/01/Janz-Freese_-Good-and-Bad-Replications-1.pdf

Jayaraman, K. S. (1987). Healthy scientific environment promoted by society in India. Nature, 326(6113), 535. https://www.nature.com/articles/326535b0.pdf

Leong, H. S., Butler, K. S., Brinker, C. J., Azzawi, M., Conlan, S., Dufés, C., Owen, A., Rannard, S., Scott, C., Chen, C., Dobrovolskaia, M. A., Kozlov, S. V., Prina-Mello, A., Schmid, R., Wick, P., Caputo, F., Boisseau, P., Crist, R. M., McNeil, S. E., . . . Pastore, C. (2019). On the issue of transparency and reproducibility in nanomedicine. Nature Nanotechnology, 14(7), 629–635.

Lévy, R. (2022). Is it somebody else’s problem to correct the scientific literature? Rapha-Z-Lab. https://raphazlab.wordpress.com/2022/12/15/is-it-somebody-elses-problem-to-correct-the-scientific-literature/

Loher, D., & Strasser, S. (2019). Politics of precarity: neoliberal academia under austerity measures and authoritarian threat. Social Anthropology, 27(S2), 5–14.

Lubega, N., Anderson, A., & Nelson, N. C. (2023). Experience of irreproducibility as a risk factor for poor mental health in biomedical science doctoral students: A survey and interview-based study. PLOS ONE, 18(11), e0293584.

Martin, A. (2004). Can’t Any Body Count? Social Studies of Science, 34(6), 923–948.

McNutt, M. (2014, July 11). Journals unite for reproducibility. American Association for the Advancement of Science.

Pashler, H., & Wagenmakers, E.‑J. (2012). Editors' Introduction to the Special Section on Replicability in Psychological Science: A Crisis of Confidence? Perspectives on Psychological Science, 7(6), 528–530.

Penders, B. (2022). Process and Bureaucracy: Scientific Reform as Civilisation. Bulletin of Science, Technology & Society, 42(4), 107–116.

Penders, B., & Janssens, A. C. J. W. (2018). Finding Wealth in Waste: Irreplicability Re-Examined. BioEssays, 40(12), e1800173.

Peterson, D., & Panofsky, A. (2021). Self-correction in science: The diagnostic and integrative motives for replication. Social Studies of Science, 1-23.

Shahare, M., & Roberts, L. L. (2020). Historicizing the crisis of scientific misconduct in Indian science. History of Science, 58(4), 485–506.

Sismondo, S. (2020). Ghost-­Managing and Gaming Pharmaceutical Knowledge. In M. Biagioli & A. Lippman (Eds.), Gaming the Metrics: Misconduct and manipulation in academic research (pp. 123–134). MIT Press.

Sismondo, S. (2021). Epistemic Corruption, the Pharmaceutical Industry, and the Body of Medical Science. Frontiers in Research Metrics and Analytics, 6, 614013.

Täuber, S. (2022). Women Academics' Intersectional Experiences of Policy Ineffectiveness in the European Context. Frontiers in Psychology, 13, 810569.

Welsh, J. (2021). Stratifying academia: ranking, oligarchy and the market‐myth in academic audit regimes. Social Anthropology, 29(4), 907–927.
