I develop almost everything in open source. People sometimes ask why I spend so much time on documentation, examples, and polish for free software.
The answer is simple: science should be reproducible, and code is increasingly central to scientific claims.
The Reproducibility Crisis
Academia has a problem. Published papers often cannot be reproduced. The reasons are mundane:
- Methods described too vaguely
- Data not available
- Code never released
- Dependencies undocumented
- Computational environment not preserved
This is not just inefficient. It undermines the scientific method. A result you cannot reproduce is not a result. It is an anecdote.
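The last two items on that list are the cheapest to fix. As a rough sketch (not a full solution, and assuming a Python project), the standard library alone can write down the interpreter version, the platform, and every installed package version next to your results. The function name `snapshot_environment` is made up for this illustration.

```python
import json
import platform
from importlib import metadata


def snapshot_environment():
    """Return a JSON-serializable record of the interpreter and installed packages.

    Hypothetical helper for illustration; it records versions only and does not
    capture system libraries, hardware, or environment variables.
    """
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip broken installs that report no name
            packages[name] = dist.version
    return {
        "python": platform.python_version(),
        "implementation": platform.python_implementation(),
        "platform": platform.platform(),
        "packages": dict(sorted(packages.items(), key=lambda kv: kv[0].lower())),
    }


def snapshot_as_json():
    """Serialize the snapshot so it can be saved alongside results."""
    return json.dumps(snapshot_environment(), indent=2)
```

Saving the output of `snapshot_as_json()` with each set of results is a small step. A lock file or container image preserves more, but even this much tells a later reader which versions produced the numbers.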
Code as Scientific Artifact
When your research involves computation (and whose doesn’t these days?), your code is part of your methodology. Hiding it is like a biologist refusing to describe their experimental protocol.
Open source is not charity. It is scientific rigor.
Why I Document Obsessively
Every library I publish includes:
- Clear installation instructions
- Reproducible examples
- API documentation
- Tests that demonstrate usage
- Version-controlled history showing evolution
This takes time. But it means someone in 2028 can:
- Understand what I did
- Reproduce my results
- Build on my work
- Find my errors
That last point matters. I want people to find my errors. That is how science works.
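To make "reproducible examples" and "tests that demonstrate usage" concrete, here is a minimal sketch in Python. The function `bootstrap_mean` is a hypothetical stand-in, not taken from any of my libraries. The point is the pattern: randomness comes from an explicit seed, and the docstring doubles as a test anyone can run.

```python
import random


def bootstrap_mean(values, n_resamples=1000, seed=0):
    """Estimate the mean of `values` by averaging bootstrap resample means.

    The same inputs and seed always give the same output:

    >>> a = bootstrap_mean([1.0, 2.0, 3.0, 4.0], seed=42)
    >>> b = bootstrap_mean([1.0, 2.0, 3.0, 4.0], seed=42)
    >>> a == b
    True
    """
    rng = random.Random(seed)  # local generator, so global random state is untouched
    n = len(values)
    total = 0.0
    for _ in range(n_resamples):
        # Draw n items with replacement and accumulate the resample mean.
        resample = [values[rng.randrange(n)] for _ in range(n)]
        total += sum(resample) / n
    return total / n_resamples
```

If this were saved as a module, `python -m doctest` on the file would check the docstring example. A reader who suspects an error has a fixed starting point to argue from.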
The Broader Point
Open source accelerates science by enabling replication, facilitating collaboration, preventing redundant work, and building cumulative knowledge. None of this works if the code stays on your laptop.
I will keep publishing everything. Not for recognition, but because science is a collective enterprise that only works if we show our work.