Felix Rusche contributed as an analyst to a large-scale international reanalysis study published in Nature on April 1.
Large-scale study finds conclusions often diverge when hundreds of researchers reanalyze the same data
A study published in Nature points to a promising path for strengthening the credibility of the social and behavioral sciences. In this large-scale international collaboration, nearly 500 independent analysts examined the same datasets and frequently arrived at different conclusions. The project delivers a clear message: scientific objectivity does not lie in identifying a single “true” analysis, but in making the space of plausible alternatives transparent, both in research reports and in communication with the broader scientific community.
A new study published in Nature, "Estimating the Analytic Robustness of Social and Behavioural Sciences," finds that scientific conclusions can shift dramatically depending on who conducts the analysis.
The results come from a large-scale international collaboration led by Balázs Aczél and Barnabás Szászi (Eötvös Loránd University and Corvinus University), conducted as part of the Systematizing Confidence in Open Research and Evidence (SCORE) program. A team of 457 independent analysts from institutions around the world conducted 504 reanalyses of data from 100 previously published studies across the social and behavioral sciences. All analysts received the same dataset and the same key research question, but were given freedom in how to conduct the analysis based on their informed judgment.
Over the past decade, the social and behavioral sciences have undergone substantial reforms aimed at making research more transparent, rigorous, and reliable. Preregistration, registered reports, replication studies, and checks of analytical reproducibility all seek to reduce the prevalence of chance findings and biased results. One important question, however, has received relatively little attention: to what extent do research findings depend on the specific way in which data are analyzed?
In standard scientific practice, a dataset is typically analyzed by a single researcher or research team, and the resulting publication presents the outcome of one particular analytical pathway. While peer review assesses methodological acceptability, it rarely reveals what results might have emerged under alternative, yet equally defensible, statistical decisions.
Yet empirical research involves numerous decision points: how data are cleaned, how variables are defined, which statistical models or software are used, and how results are interpreted. Together, these choices constitute what is known as analytic variability—the flexibility that can fundamentally influence final conclusions.
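To make the idea concrete, the minimal sketch below (not taken from the study; the simulated data, decision rules, and thresholds are all hypothetical) crosses three common decision points and shows how each defensible combination of choices yields a different estimate from the same data:

```python
# Hypothetical sketch of a "multiverse" of analytic choices. None of the
# rules or numbers below come from the Nature study.
import itertools
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)                      # simulated predictor
y = 0.2 * x + rng.standard_t(df=3, size=n)  # outcome with heavy-tailed noise

# Three decision points, each with defensible alternatives.
outlier_rules = {"none": lambda v: np.ones(n, bool),
                 "3sd":  lambda v: np.abs(v - v.mean()) < 3 * v.std()}
transforms    = {"raw": lambda v: v, "rank": lambda v: stats.rankdata(v)}
tests         = {"pearson": stats.pearsonr, "spearman": stats.spearmanr}

# Every combination of choices is one "universe"; each produces its own result.
for (o, keep), (t, f), (name, test) in itertools.product(
        outlier_rules.items(), transforms.items(), tests.items()):
    mask = keep(y)
    r, p = test(f(x[mask]), f(y[mask]))
    print(f"outliers={o:4s} transform={t:4s} test={name:8s} r={r:+.3f} p={p:.3f}")
```

With only two alternatives at three decision points, eight equally defensible analyses already exist; real pipelines have far more branches, which is precisely the flexibility the study quantifies.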
Key Findings
The project observed substantial variation in the outcomes of independent analyses of the same question using the same data across 100 studies. Although most reanalyses broadly supported the main claims of the original studies, effect sizes, statistical estimates, and levels of uncertainty often differed meaningfully. Only in about one third of cases did all analysts reach the same conclusion as the original authors.
Importantly, these discrepancies were not due to a lack of expertise. Experienced researchers with strong statistical backgrounds were just as likely to arrive at divergent results as others. At the same time, observational studies proved less robust than experimental ones, suggesting that more complex data structures allow greater analytical flexibility—and thus greater uncertainty.
Aczél, Professor at Eötvös Loránd University, concluded, “These findings do not call into question the credibility of prior research. Rather, they draw attention to the fact that presenting a single analysis often fails to reflect the true degree of empirical uncertainty, and that ignoring analytic variability can lead to unwarranted confidence in scientific conclusions.”
Szászi, Assistant Professor at Eötvös Loránd University and Corvinus University, added, “We advocate for the broader use of multi-analyst and ‘multiverse’ approaches, especially for questions of high scientific or societal importance. Rather than seeking a single true answer, these approaches make visible how stable—or fragile—scientific conclusions really are.”
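As a purely illustrative sketch of what such reporting might look like, the snippet below summarizes a set of analyst-level effect estimates by their spread rather than by a single number. The figures are invented for demonstration and do not come from the study:

```python
# Hypothetical multi-analyst summary; all values are fabricated.
import numpy as np

estimates = np.array([0.31, 0.05, 0.18, -0.02, 0.27, 0.12, 0.22, 0.09])  # analysts' effect sizes
p_values  = np.array([0.01, 0.62, 0.08, 0.91, 0.02, 0.25, 0.03, 0.44])   # matching p-values

print(f"median effect: {np.median(estimates):+.2f}")
print(f"range:         [{estimates.min():+.2f}, {estimates.max():+.2f}]")
print(f"share significant at p < .05: {np.mean(p_values < 0.05):.0%}")
```

Reporting the full distribution in this way shows readers at a glance whether a conclusion holds across analysts or hinges on one particular analytic path.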
The study, published in Nature, is available here.
