
JOTE in Conversation: Daniël Lakens. Whose Fault Is It That Science Is Irreproducible? And Who the Heck Are Metascientists?

By Max Bautista Perpinya & Daniël Lakens

After talking with Léa Roumazeilles, PhD candidate in Neurobiology at Oxford, we now present the second of our series ‘JOTE in conversation with researchers.’

In this series, we ask researchers how they experience 'failure' and 'success' within their research practices - whereby 'practices' can range from the research proposal and funding to experimental constraints and publication. How does failure arise in these different aspects? And how do researchers in different fields and career stages deal with failure?

This time, we spoke to Daniël Lakens, Associate Professor in the Human-Technology Interaction (HTI) group at Eindhoven University of Technology (TU/e). As a metascientist, Lakens' work consists of 'developing methods for critically reviewing and optimally structuring studies.' We talked with him about the reproducibility crisis in psychology, and how research practices, experimental methods, statistical approaches and funding affect the way errors are framed and published.

As I set the recorder to record, I told Daniël: 'You don't have to talk very loudly, the recorder is pretty good.' He responded, 'Alright, I'll try.' We chuckled. More than the volume of his voice, it is Daniël's words that are loud and travel far: his reflections on the status of science and the ethics of researchers may be poignant to some listeners. We hope this interview gives you something to reflect on.

What is metascience? How did it originate?

Metascience has been practised for a long time under other names, such as epistemology. However, metascience as we know it became very popular in psychology about ten years ago. To help you understand, let me briefly explain what has happened in the field since then.

First, in psychology there was always an understanding that not everything was perfect, but we didn't really know how bad it was, and it was easy to just tell each other, 'yeah, things are not perfect, but how bad can it be?' It was extremely common, especially as PhD students, that we would tell each other, 'we are flexibly analysing the data.' It was a very convenient thing to say because you didn't really have to think about how bad it actually was. At the same time, I remember my supervisor once told me, 'yeah, this finding is not really reliable. We know this. We should have an independent committee that replicates this sort of finding, so that it becomes widely known in the literature.' But nothing much was going on. That was sort of the state of the field.

Then in 2010, a paper came out that became really, really well known. This was a paper by Daryl Bem on precognition, the ability of people to predict the future. He took a classic task in psychology and added a variation. In the original task, you are presented with a picture on the left or on the right and press a button to say which side the stimulus is on. He flipped the task temporally, meaning that you had to press the button before the picture appeared, and he counted how often people guessed better than chance. Apparently, most people could guess better than chance. This was a paper containing nine studies that appeared in 2010 and circulated a lot around 2011. The editor said, 'I don't know what to do with this. This looks like any other psych paper that we have. It's just a crazy topic, but it consists of nine studies, so it is super convincing. I should accept this.'

After this, people had two choices: either to accept that precognition is real, since the evidence was extremely convincing, or to conclude that this indicated a problem in the way that we worked. Many people opted for the second, unsurprisingly. From the moment the Bem paper came out, there were many responses of many different kinds, saying things like: the statistical analysis is not good, it's based on p-values and we should report Bayes factors instead, etc. There were other strategies to understand what happened in Bem's studies.

Then, some researchers (Simmons, Nelson, and Simonsohn, 2011) published what became a classic of the field, 'false-positive psychology.' First, they reported that they had played the song 'When I'm Sixty-Four' by The Beatles, and a control song, to people; they showed that the people who listened to 'When I'm Sixty-Four' became younger. And you read that and say, 'that's impossible, that's a nonsense finding.' The authors said, 'of course this is a nonsense finding. But this is not all that we did in the study.' Then they added all the things that they had left out, and their selective reporting became apparent: 'we didn't just ask this, we also asked this, this, and this. And we not only tested this, but also this, this and this. And if you have flexibility in your testing, then the Type I error rate will inflate up to the point that it is really easy to find significant results.' This paper made it very clear that, given the way we work, the answer to the question 'how bad can it be?' was 'well, it can be really, really bad.' The important thing was that many people recognised the practices that this article described, and started thinking 'okay, am I doing things the wrong way?' People came around to the fact that methods were extremely important.
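To make that last point concrete, here is a minimal simulation sketch (my own illustration, not from the interview) of one simple form of flexibility: measuring several outcome variables when there is no true effect, and reporting whichever one happens to reach p < .05. The choice of four outcomes and twenty participants per group is an assumption made purely for illustration.

```python
# Illustrative simulation: how "flexible" analysis inflates the Type I error rate.
# The flexibility here is testing several outcomes under a true null and
# reporting any that reach p < .05. All parameter values are arbitrary choices.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2021)
n_simulations = 10_000
n_per_group = 20   # participants per condition
n_outcomes = 4     # dependent variables the researcher is free to test

false_positives = 0
for _ in range(n_simulations):
    # No true effect: both groups are drawn from the same distribution.
    group_a = rng.normal(0.0, 1.0, size=(n_outcomes, n_per_group))
    group_b = rng.normal(0.0, 1.0, size=(n_outcomes, n_per_group))
    p_values = [stats.ttest_ind(a, b).pvalue for a, b in zip(group_a, group_b)]
    if min(p_values) < 0.05:   # report whichever outcome "worked"
        false_positives += 1

print("nominal alpha: 0.05")
print(f"observed false-positive rate: {false_positives / n_simulations:.3f}")
```

Under these assumptions the observed false-positive rate comes out around 1 - 0.95^4, roughly 19% instead of the nominal 5%, and it climbs further once optional stopping, optional covariates, and subgroup analyses are added on top.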

In parallel, in the Netherlands there was the case of fraud by Diederik Stapel, who made up data. And that's kind of boring in terms of the process, because he just typed in random data. You are not supposed to do that; everyone agreed on that. So in the Netherlands you had these two things together, close in time. The Daryl Bem studies were the rational part, and the fraud case of Diederik Stapel was the emotional one. It was the moment when many people realised for the first time that there is a big issue with the way that we work, and we were all strongly motivated. We knew we had to do something about it. Now, the Netherlands is regarded by most people as one of the countries leading the change on these issues.

What is the role of journals in promoting good science?

They have a big role. Part of the challenge in disseminating tools for better science is educational. There's a new journal - Advances in Methods and Practices in Psychological Science - that helps people improve the way that they work across a very broad spectrum. There are papers about theory, measurement, concept formation, how we should collaborate and store data, informed consent - everything. All of this follows from one another: if you want more transparency, you need to share your data, and for this you need to get informed consent. But how do you do this for fMRI research, where you scan people and have their face, for example? All sorts of practical and ethical considerations have come out of this, which has kept people busy for a very long time.

Another key development is the emergence of open-access publications - especially ones like PLOS ONE. For example, there was a classic case of a failure to replicate. It was a very highly cited study about elderly priming, where people are presented subliminally with words related to the elderly and then walk more slowly.

During a conference in 2009 or so, I remember talking to Stéphane Doyen, who was the first author on a paper that attempted to replicate it. He presented the failure to replicate at the conference and I thought: 'this is super cool that you've done this. Credit to you for trying to replicate it!' I remember emailing him a couple of years later saying 'hey, I was thinking about your study. Where is it?' He said 'yeah, unsurprisingly all journals rejected it,' because of course the original author was most likely one of the reviewers.

Then eventually it was published in PLOS ONE, because that journal had an explicit rule: we don't care about novelty, we care about the reliability of the findings. So there started to be outlets that gave people room to publish, for example, failures to replicate. This was an important development. There was a huge discussion about this finding; the original author wrote very harsh, impolite blog posts about it. It's all very interesting to look back at. I'm sure he regrets it now, but it tells you something about the state of the field: you couldn't challenge a finding like this; people really thought it was etched in stone. But if we look at it now, these are studies with fourteen participants in each condition. C'mon.
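To give a sense of why fourteen participants per condition is so little, here is a quick power calculation (my own sketch, not part of the interview, and assuming a medium standardized effect of d = 0.5 purely for illustration).

```python
# Illustrative power calculation (not from the interview). Assumes a two-sample
# t-test, alpha = .05 (two-sided), and a medium standardized effect of d = 0.5;
# the effect size is an assumption chosen only for illustration.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# Power with 14 participants per condition, as in the original studies.
power = analysis.solve_power(effect_size=0.5, nobs1=14, alpha=0.05, ratio=1.0)
print(f"Power with n = 14 per group and d = 0.5: {power:.2f}")  # roughly 0.24

# Sample size per group needed to reach 80% power for the same effect.
n_needed = analysis.solve_power(effect_size=0.5, power=0.8, alpha=0.05, ratio=1.0)
print(f"n per group needed for 80% power: {n_needed:.0f}")      # roughly 64
```

In other words, under these assumptions such a study would detect a true medium-sized effect only about one time in four, which is why significant results from designs this small warrant scepticism.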

There is a lot of innovation in how science is published. Tell us about that.

There is a journal called Meta-Psychology that publishes metascience. The journal itself is also a little bit of an experiment: it has open peer review, and submission is basically uploading your pre-print somewhere. It will take into account any comments made on the pre-print on social media or through Hypothes.is, a tool that lets you annotate web pages and leave comments on the pre-print. It charges zero APCs (article processing charges), and it also has formats similar to what you are trying to do at JOTE; it has an 'empty your file drawer' format. This type of journal is not super popular yet, but there are a couple of examples now. The problem is getting people to actually submit null results.

Do you know how publication bias towards positive results (the file drawer problem) is being addressed at the moment?

In an indirect way, through registered reports. Brian Nosek and I put out a call for a special issue on this in 2012. Around the same time, Chris Chambers proposed a similar idea, and he really pushed this at many journals - hundreds of journals now have it - where the decision to publish is made before the data are collected. It combats publication bias and it improves quality a lot, because the peer-review process moves to the front, before the study is actually performed. This is a development that in practice works very well in reducing publication bias. People are trying it out - not a lot of people yet, but some.

What about the role of funders? Is it hard to get funding for replication research?

Not a lot is happening when it comes to funding. Sander L. Koole and I also wrote a paper in 2012, about the importance of rewarding replication research, because who's going to do a replication study when novel studies are so much more rewarded? When we wrote that paper, I thought we should at least try to reach out to somebody from NWO, the Dutch national research funder, about this. We can just write a paper and put it in a journal, but that seemed a bit boring. So we thought: if we want to be serious about this, we should send a letter to NWO asking them to fund replication research. And so we did.

Initially they wrote a reply saying ‘we will fund replication research as long as it is innovative.’

[Silence]

Wait, what?

[Chuckles] Yes - I mean, of course it's not innovative. You're just repeating something. So we thought, how can they officially reply like this? I think they didn't completely understand. So I sent a letter back saying that this was basically destroying the foundations of science: if this is what you think, then that is really problematic.

But even though their first response was odd, they're actually a really nice organisation. They invited me over to talk about this topic, and I tried to convince them that it would be a good idea to fund replication research. Crazily enough, it took them a long time - three years or something - but now they have a call to fund replication studies! It's a pilot programme, so they're evaluating it and looking at how they should extend it into the future, which is pretty nice.

A couple of other funders paid attention and thought: hmm, okay, we should also take responsibility for making science more reproducible. This is good news, because I think we need to reserve part of the funding and make it available only for replication research; otherwise novel research will always be valued more. The key point is that nobody will build a career by replicating other people's work, but we know that somebody needs to do it.

Do you think scientists behave ethically? Who in your opinion has the power to change the way science is done?

Anyone who really tried to follow the Code of Conduct in their lab nowadays would encounter pushback. If you ask researchers who's responsible, they might say 'the head of my group is responsible' or 'I won't get my PhD if I don't do what my supervisor wants.' In turn, the supervisor is going to say, 'the dean of the faculty tells me that I have to publish this much, and do this, and get these kinds of grants, and so on. And how do I do that? By doing this kind of research.' The dean is going to say, 'well, the university says that the government is forcing us to be such and such.' So if you ask 'who's responsible?', the truth is that everybody is pointing at each other but nobody is doing anything.

I'm interested in giving advice to the government or to funders about these kinds of things. I don't think it's very popular amongst scientists to actually say that people need to force us to do the right thing, because they feel it will make their lives more difficult. And yes, it will. But, honestly, I don't care - I want to make my life more difficult. It is tax money that we're spending, and we should make sure that it is spent efficiently.

What do you think is the role of metascience in all of this? Do you think this role of enforcing or making the rules is a job that metascientists could do?

Well first of all, will the general scientific community listen to metascience? No, of course not, because they will say ‘you’re just trying to get as much power as possible and do whatever your political agenda is.’ So I wouldn’t say that metascientists are supposed to determine these kinds of rules.

You should have a team of people who are interested in developing policy advice. I don't think NWO should listen to just any single person or group of metascientists, because metascientists are a sceptical bunch with very specific views about certain things, and there are alternative viewpoints. Some people think we should focus on the big and the good stuff, for example.

I think metascientists can at least ask the questions that aren't being asked. It's amazing how science organised itself bottom-up over the last 500 years, and how we keep things in place because 'this is how it is.' I think the role of metascientists is to ask: why is it like this? What are we doing? This doesn't make sense.

For example, publication bias is a disaster for science. When I discuss it in workshops I don’t think people go home very happy and proud about being a scientist. I think they go home with at least some motivation not to contribute to publication bias too much in the future. This is the best I can achieve when I teach young researchers about this now.

Don’t you think that there is some sort of implicit bias that we are not able to see through?

Of course we will do things wrongly. This is a process where you try your best, and in 50 years someone is going to say: look at these idiots. They thought they were improving science in 2020. They missed this and this and this. That's just the logical process of how we work.

You can’t justify everything. The real thing is to justify a little bit more than you’re doing now and ask yourself: why am I actually doing this? That’s the goal. I want to be able to stop following rules - especially norms.