25. Reproducible Workflow
From Analysis to Transparent Reporting
1 Introduction
A sound analysis is not complete until it is transparent, traceable, and reproducible. Reproducibility is not an optional extra added after the statistics are finished. It is the workflow that keeps data, code, figures, tables, interpretation, and reporting connected from the start.
In practical terms, reproducibility means that someone else, or you six months later, can see what was done, why it was done, and how the reported results were generated.
2 Key Concepts
- Reproducibility means the analysis can be traced and rerun from the original data and code.
- Transparency means the analytical choices are visible, not hidden in undocumented steps.
- A literate workflow keeps code, output, and narrative close together.
- Diagnostics and limitations are part of the reproducible record, not optional extras.
- Good workflow supports better science, not just better formatting.
3 Core Principles
The practical principles are straightforward:
- keep raw data separate from derived outputs;
- keep code and narrative connected where possible;
- distinguish exploratory work from confirmatory claims;
- document the analytical decisions that affect interpretation;
- make tables and figures reproducible from source rather than by hand editing.
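As a minimal sketch of the first and last principles, the example below reads a raw file without ever modifying it and writes a summary table into a separate derived folder. It is written in Python purely for illustration (the module itself works in R and Quarto, where the same pattern applies). The folder names `data/raw` and `data/derived`, the file names, and the columns `site` and `length_mm` are all hypothetical.

```python
import csv
import statistics
import tempfile
from pathlib import Path


def build_summary_table(raw_csv, derived_dir):
    """Read raw data (opened read-only) and write a derived summary table."""
    with raw_csv.open(newline="") as handle:
        rows = list(csv.DictReader(handle))

    # Group the hypothetical 'length_mm' measurements by 'site'.
    by_site = {}
    for row in rows:
        by_site.setdefault(row["site"], []).append(float(row["length_mm"]))

    # Derived outputs live in their own folder, never next to the raw data.
    derived_dir.mkdir(parents=True, exist_ok=True)
    out_path = derived_dir / "summary_by_site.csv"
    with out_path.open("w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["site", "n", "mean_length_mm"])
        for site in sorted(by_site):
            values = by_site[site]
            writer.writerow([site, len(values), round(statistics.mean(values), 2)])
    return out_path


def demo_summary():
    """Build a throwaway project, run the step, and return the table rows."""
    with tempfile.TemporaryDirectory() as tmp:
        project = Path(tmp)
        raw_dir = project / "data" / "raw"
        raw_dir.mkdir(parents=True)
        raw_csv = raw_dir / "measurements.csv"
        raw_csv.write_text("site,length_mm\nA,10.0\nA,12.0\nB,20.0\n")
        out = build_summary_table(raw_csv, project / "data" / "derived")
        with out.open(newline="") as handle:
            return list(csv.reader(handle))


if __name__ == "__main__":
    for line in demo_summary():
        print(line)
```

Because the table is regenerated by the function each time, correcting a value in the raw file and rerunning the script updates the table without any hand editing.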
4 Tools and Practice
In this module, reproducibility is supported mainly through:
- R scripts;
- Quarto documents;
- a stable project structure;
- explicit reporting of assumptions, diagnostics, and limitations.
The practical habit to build is simple: if a figure, table, or model result appears in your report, there should be a clear path back to the code and data that generated it.
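One way to make that path explicit is to store a small provenance record next to each output. The sketch below is an assumption about how this could be done, not a feature of R or Quarto: it writes a JSON sidecar naming the script and data file behind an output, together with a SHA-256 hash of the data so that later changes can be detected. It is shown in Python purely for illustration; the helper names, the `.provenance.json` suffix, and the file names are hypothetical.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def file_sha256(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def write_provenance(output, data, script):
    """Write a JSON sidecar recording which data and script produced `output`."""
    record = {
        "output": output.name,
        "script": script,
        "data": data.name,
        "data_sha256": file_sha256(data),
    }
    sidecar = output.parent / (output.name + ".provenance.json")
    sidecar.write_text(json.dumps(record, indent=2))
    return sidecar


def data_unchanged(sidecar, data):
    """Check whether the data file still matches the recorded hash."""
    record = json.loads(sidecar.read_text())
    return record["data_sha256"] == file_sha256(data)


def demo_provenance():
    """Record provenance for a placeholder figure, then alter the data."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        data = root / "measurements.csv"
        data.write_text("site,length_mm\nA,10.0\n")
        figure = root / "fig1.png"
        figure.write_bytes(b"placeholder for a rendered figure")
        sidecar = write_provenance(figure, data, script="01_plot_lengths.py")
        before = data_unchanged(sidecar, data)
        data.write_text("site,length_mm\nA,99.0\n")  # simulate a silent edit
        after = data_unchanged(sidecar, data)
        return before, after


if __name__ == "__main__":
    print(demo_provenance())
```

The check passes while the data file is untouched and fails once it has been edited, which flags a figure that no longer corresponds to the data on disk.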
5 A Practical Workflow
An effective reproducible workflow often looks like this:
- organise the project clearly;
- keep data import, cleaning, analysis, and reporting traceable;
- generate outputs from code rather than manual editing;
- write interpretations close to the analysis that supports them;
- leave enough information for another person to rerun the analysis.
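The steps above can be sketched as a chain of small functions, one per stage, so that the reported sentence is produced by code rather than typed by hand. This is an illustrative outline under stated assumptions, written in Python only for compactness (an R script or Quarto document would follow the same shape). The function names, the inline data, and the columns `site` and `length_mm` are hypothetical.

```python
import csv
import io
import statistics

# Hypothetical raw data; in a real project this would be a file on disk.
RAW_TEXT = """site,length_mm
A,10.0
A,12.0
B,20.0
B,
"""


def import_data(text):
    """Import: parse the raw CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def clean_data(rows):
    """Clean: drop rows with a missing measurement and convert to float."""
    kept = [r for r in rows if r["length_mm"].strip()]
    return [{"site": r["site"], "length_mm": float(r["length_mm"])} for r in kept]


def analyse(rows):
    """Analyse: compute the sample size and mean length."""
    values = [r["length_mm"] for r in rows]
    return {"n": len(values), "mean": statistics.mean(values)}


def report(result):
    """Report: build the sentence directly from the computed result."""
    return f"Mean length was {result['mean']:.1f} mm (n = {result['n']})."


def run_pipeline(text):
    """Run every stage in order so the whole analysis reruns with one call."""
    return report(analyse(clean_data(import_data(text))))


if __name__ == "__main__":
    print(run_pipeline(RAW_TEXT))
```

Keeping each stage in its own function means another person can rerun the whole analysis with a single call and can also inspect any intermediate step.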
6 Common Failures
The most common reproducibility failures do not come from a lack of technical sophistication. They are workflow problems:
- analyses performed interactively without a saved script;
- figures edited by hand after export;
- undocumented exclusions or transformations;
- final reports disconnected from the code that generated the results;
- no clear distinction between exploratory and confirmatory steps.
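Undocumented exclusions are among the easiest of these failures to prevent. As a minimal sketch, the function below applies named exclusion rules in order and records how many rows each rule removes, so the log can be reported alongside the results. It is written in Python purely for illustration (the same idea works in an R script); the rule names, the `id` and `length_mm` fields, and the example rows are hypothetical.

```python
def apply_exclusions(rows, rules):
    """Apply named exclusion rules in order, logging what each one removes."""
    log = []
    kept = rows
    for name, should_exclude in rules:
        remaining = [r for r in kept if not should_exclude(r)]
        log.append({
            "rule": name,
            "removed": len(kept) - len(remaining),
            "remaining": len(remaining),
        })
        kept = remaining
    return kept, log


# Hypothetical records: one missing value and one impossible value.
EXAMPLE_ROWS = [
    {"id": 1, "length_mm": 10.0},
    {"id": 2, "length_mm": None},
    {"id": 3, "length_mm": -5.0},
    {"id": 4, "length_mm": 12.5},
]

# Rules run in order, so missing values are removed before the numeric check.
EXAMPLE_RULES = [
    ("missing length", lambda r: r["length_mm"] is None),
    ("non-positive length", lambda r: r["length_mm"] <= 0),
]


if __name__ == "__main__":
    kept_rows, exclusion_log = apply_exclusions(EXAMPLE_ROWS, EXAMPLE_RULES)
    for entry in exclusion_log:
        print(entry)
    print([r["id"] for r in kept_rows])
```

Because every exclusion passes through a named rule, the log doubles as the documentation of what was dropped and why.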
7 Summary
- Reproducibility links data, code, output, interpretation, and reporting.
- A transparent workflow makes analyses easier to evaluate, revise, and trust.
- Quarto and scripted analysis support reproducibility by design.
- Reporting diagnostics, assumptions, and limitations is part of reproducible science.
This closes the basic_stats sequence by connecting statistical reasoning to scientific practice. A good analysis is not only statistically defensible. It is also traceable from source to conclusion.
Citation
@online{smit2026,
author = {Smit, A. J.},
title = {25. {Reproducible} {Workflow}},
date = {2026-03-19},
url = {http://tangledbank.netlify.app/BCB744/basic_stats/25-reproducible-workflow.html},
langid = {en}
}
