Initial evidence of research quality of registered reports compared with the standard publishing model

Abstract

In registered reports (RRs), initial peer review and in-principle acceptance occur before the research outcomes are known. This combats publication bias and distinguishes planned from unplanned research. The mechanism by which RRs could improve the credibility of research findings is straightforward, but there is little empirical evidence. There could also be unintended costs, such as reduced novelty. Here, 353 researchers peer reviewed a pair of papers from 29 published RRs from psychology and neuroscience and 57 non-RR comparison papers. RRs numerically outperformed comparison papers on all 19 criteria (mean difference 0.46, scale range −4 to +4), with effects ranging from RRs being statistically indistinguishable from comparison papers in novelty (0.13, 95% credible interval [−0.24, 0.49]) and creativity (0.22, [−0.14, 0.58]) to sizeable improvements in rigour of methodology (0.99, [0.62, 1.35]) and analysis (0.97, [0.60, 1.34]) and in overall paper quality (0.66, [0.30, 1.02]). RRs could improve research quality while reducing publication bias and ultimately improve the credibility of the published literature.


Fig. 1: Posterior probability distributions for parameter estimates from the within-subjects analysis using partial pooling across the 19 outcomes, comparing RRs with comparison articles.
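The estimates in Fig. 1 come from a hierarchical analysis that partially pools within-reviewer difference scores (RR rating minus comparison rating) across the 19 outcome criteria. The sketch below is not the authors' model or priors; it is a minimal, hypothetical illustration of partial pooling on simulated difference scores, written in PyMC, with made-up variable names, priors and data sizes.

```python
import numpy as np
import pymc as pm
import arviz as az

# Hypothetical illustration only: simulated within-reviewer difference scores
# (RR rating minus comparison rating) for 19 outcome criteria on a -4..+4 scale.
rng = np.random.default_rng(0)
n_outcomes = 19
n_ratings = 600
outcome_idx = rng.integers(0, n_outcomes, size=n_ratings)     # which criterion each rating belongs to
diff_scores = rng.normal(loc=0.5, scale=1.5, size=n_ratings)  # fake RR-minus-comparison differences

with pm.Model() as partial_pooling:
    mu = pm.Normal("mu", mu=0.0, sigma=1.0)          # grand mean difference across criteria
    tau = pm.HalfNormal("tau", sigma=1.0)            # between-criterion spread
    theta = pm.Normal("theta", mu=mu, sigma=tau,
                      shape=n_outcomes)              # partially pooled per-criterion effects
    sigma = pm.HalfNormal("sigma", sigma=2.0)        # rating-level noise
    pm.Normal("obs", mu=theta[outcome_idx], sigma=sigma, observed=diff_scores)
    idata = pm.sample(1000, tune=1000, chains=2, random_seed=0)

# 95% credible intervals for the grand mean and each criterion's effect
print(az.summary(idata, var_names=["mu", "theta"], hdi_prob=0.95))
```

As in Fig. 1, positive per-criterion effects would correspond to a performance advantage for RRs; partial pooling shrinks noisy criterion-level estimates toward the grand mean, which is why a single model across all 19 outcomes is used rather than 19 separate tests.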


Data availability

All data files are available on OSF: https://osf.io/aj4zr/.

Code availability

All files and scripts are available on OSF: https://osf.io/aj4zr/.


Acknowledgements

The authors thank L. Hummer for help with study planning, A. Denis and Z. Loomas for help preparing survey materials, B. Bouza and N. Buttrick for help with implementing the survey in Qualtrics and A. Allard for help coding the articles. This research was funded by grants from Arnold Ventures and James S. McDonnell Foundation (grant # 220020498) to B.A.N. and supported by the National Science Foundation Graduate Research Fellowship Program (grant # 1247392 awarded to S.R.S). The funders had no role in study design, analysis, decision to publish, or preparation of the manuscript.

Author information

Authors and Affiliations

Authors

Contributions

Conceptualization: Survey: C.K.S., T.M.E. and B.A.N. Article coding: S.R.S., J.B. and S.V. Data curation: Survey: C.K.S. Article coding: S.R.S. and J.B. Formal analysis: Survey: C.K.S. and K.M.E. Article coding: S.V., S.R.S. and J.B. Investigation: Survey: C.K.S. and T.M.E. Article coding: S.R.S., J.B. and S.V. Methodology: Survey: C.K.S., T.M.E., K.M.E. and B.A.N. Article coding: S.R.S., J.B. and S.V. Software: Article coding: S.R.S. and J.B. Visualization: Survey: C.K.S. Article coding: S.R.S. and J.B. Validation: Survey: K.M.E. Article coding: S.V. Project administration: T.M.E. Resources: T.M.E. and F.S.T. Supervision: T.M.E. and B.A.N. Funding acquisition: T.M.E. and B.A.N. Writing, original draft: C.K.S., T.M.E., K.M.E. and B.A.N. Writing, review and editing: C.K.S., T.M.E., S.R.S., J.B., F.S.T., S.V., K.M.E. and B.A.N.

Corresponding author

Correspondence to Brian A. Nosek.

Ethics declarations

Competing interests

T.M.E., C.K.S. and B.A.N. are employees of the nonprofit Center for Open Science (COS), which has a mission to increase openness, integrity and reproducibility of research. COS offers support to journals, editors and researchers in adopting and conducting RRs. The remaining authors declare no conflicts of interest.

Additional information

Peer review information Nature Human Behaviour thanks Balazs Aczel, Marcel van Assen and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Plot of correlations between all difference-score outcome variables.

Correlation matrix of the 19 difference-score outcome variables; larger, darker blue circles indicate stronger positive correlations.

Extended Data Fig. 2 Posterior probability distributions for parameter estimates for each DV at each level of Familiar, comparing RRs with comparison articles.

Horizontal lines indicate 80% (thick) and 95% (thin) credible intervals, and dots show the means of the posteriors. Positive values indicate a performance advantage for RRs; negative values indicate a performance advantage for comparison articles.

Extended Data Fig. 3 Posterior probability distributions for parameter estimates for each DV at each level of Improve, comparing RRs with comparison articles.

Horizontal lines indicate 80% (thick) and 95% (thin) credible intervals, and dots show the means of the posteriors. Positive values indicate a performance advantage for RRs; negative values indicate a performance advantage for comparison articles.

Extended Data Fig. 4 Posterior probability distributions for parameter estimates for each DV at each level of ‘Guessed Right’, comparing RRs with comparison articles.

Horizontal lines indicate 80% (thick) and 95% (thin) credible intervals, and dots show the means of the posteriors. Positive values indicate a performance advantage for RRs; negative values indicate a performance advantage for comparison articles.

Supplementary information

Supplementary Methods, Supplementary Discussion, Supplementary Figs. 1–6, Supplementary Tables 1–10 and Supplementary References.

Reporting summary

Peer review information


Cite this article

Soderberg, C.K., Errington, T.M., Schiavone, S.R. et al. Initial evidence of research quality of registered reports compared with the standard publishing model. Nat Hum Behav 5, 990–997 (2021). https://doi.org/10.1038/s41562-021-01142-4

