Building R Packages for Statistical Inference

May 8, 2020 (updated February 24, 2026)

R statistics statistical-computing open-source

The best thing about this math degree so far: I’m not just using R packages anymore. I’m building them.

Why R

R was built for statistics. Not adapted to it, not retrofitted. The language maps statistical concepts naturally: every standard distribution ships with density, CDF, quantile, and sampling functions, formulas are a data type, and data frames are the default structure. CRAN hosts decades of mature, widely used statistical packages.
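To make that concrete, here is a small base-R sketch. The Weibull parameters and the variable names in the formula are made up for illustration; the functions themselves are all in base R.

```r
# Every standard distribution follows the d/p/q/r naming convention.
# Shown here for a Weibull with illustrative (made-up) parameters.
shape <- 2
scale <- 10

dens <- dweibull(5, shape, scale)    # density at t = 5
cdf  <- pweibull(5, shape, scale)    # P(T <= 5)
med  <- qweibull(0.5, shape, scale)  # median lifetime

set.seed(1)
draws <- rweibull(100, shape, scale) # 100 random lifetimes

# A formula is an ordinary object that can be stored and inspected.
f <- y ~ x1 + x2
class(f)     # "formula"
all.vars(f)  # "y" "x1" "x2"
```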

And R Markdown lets me write code, math, and prose in one document. For statistical work, that matters. A derivation that lives next to the code that implements it is worth more than either alone.

What I’m Building

R packages for reliability analysis. Specifically:

Maximum likelihood estimation for series systems with masked failure data. A system fails, you know that it failed, but not which component caused it. The likelihood function for this scenario is non-trivial, and the existing tools don’t handle it well.
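Here is a minimal sketch of what that likelihood can look like, under assumptions that are mine for the sake of the example and not necessarily what the packages implement: exponential component lifetimes, no censoring, and a candidate set for each failure that always contains the true cause and is otherwise uninformative about it. Under those assumptions, a failure at time t with candidate set C contributes exp(-t * sum(lambda)) * sum(lambda[C]) to the likelihood. The rates, sample size, and variable names below are all hypothetical.

```r
set.seed(42)
rates <- c(0.1, 0.2, 0.3)  # hypothetical true component failure rates
m <- length(rates)
n <- 500

# Simulate component lifetimes (column j uses rates[j]).
# A series system fails when its first component fails.
lifetimes <- matrix(rexp(n * m, rate = rep(rates, each = n)), nrow = n)
t_sys <- apply(lifetimes, 1, min)
cause <- apply(lifetimes, 1, which.min)

# Masking: an n x m 0/1 matrix of candidate sets. Each component enters a
# set with probability 0.5, and the true cause is always included.
cand <- matrix(as.numeric(runif(n * m) < 0.5), nrow = n)
cand[cbind(seq_len(n), cause)] <- 1

# Log-likelihood, parameterised on the log scale so rates stay positive.
loglik <- function(log_rates) {
  lambda <- exp(log_rates)
  sum(-t_sys * sum(lambda) + log(drop(cand %*% lambda)))
}

# optim() minimises, so pass the negative log-likelihood.
fit <- optim(rep(log(0.1), m), function(p) -loglik(p), method = "BFGS")
mle <- exp(fit$par)
mle
```

A handy sanity check under these assumptions: the estimated rates should sum to n / sum(t_sys), the usual exponential estimate of the system-level failure rate. Masking only affects how that total is split among components.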

Bootstrap confidence intervals for reliability metrics that don’t have closed-form variance expressions. When you can’t derive the variance analytically, you resample.
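As an illustration of the resampling idea, here is a percentile bootstrap in base R for the B10 life (the time by which 10% of units have failed). The data are simulated, and the choice of metric, the helper name b10, and the number of resamples are assumptions for the example.

```r
set.seed(1)
times <- rweibull(60, shape = 1.5, scale = 100)  # hypothetical failure times

# Metric of interest: the 10th percentile of the lifetime distribution.
b10 <- function(x) unname(quantile(x, probs = 0.10))

# Resample the data with replacement and recompute the metric each time.
B <- 2000
boot_stats <- replicate(B, b10(sample(times, replace = TRUE)))

# Percentile interval: the middle 95% of the bootstrap distribution.
ci <- quantile(boot_stats, probs = c(0.025, 0.975))
b10(times)
ci
```

The percentile interval is the simplest variant. Other bootstrap intervals exist and can behave better in small samples, but the resampling loop stays the same.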

Survival analysis tools for right-censored Weibull data. Some components are still running when observation ends, so all you know is that their lifetimes exceed the observed time. You have to account for what you didn’t see.
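A rough sketch of how censoring enters the likelihood: an observed failure contributes the Weibull density at its failure time, while a unit still running at the end of observation contributes the survival probability at that time. The simulated data, the single fixed cutoff tau, and the starting values are assumptions for the example.

```r
set.seed(7)
n <- 300
true_shape <- 1.8  # hypothetical true parameters
true_scale <- 50

latent <- rweibull(n, true_shape, true_scale)
tau <- 60                  # observation ends at time tau
time <- pmin(latent, tau)  # what we actually record
failed <- latent <= tau    # FALSE means right-censored at tau

# Negative log-likelihood on the log scale, so both parameters stay positive.
negloglik <- function(par) {
  shape <- exp(par[1])
  scale <- exp(par[2])
  -sum(ifelse(
    failed,
    dweibull(time, shape, scale, log = TRUE),                        # log density
    pweibull(time, shape, scale, lower.tail = FALSE, log.p = TRUE)   # log survival
  ))
}

fit <- optim(c(0, log(mean(time))), negloglik, method = "BFGS")
est <- exp(fit$par)  # estimated shape and scale
est
```

Dropping the censored units, or treating their censoring times as failures, would bias the fit toward shorter lifetimes. The survival term is what corrects for that.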

How I Build Them

I treat package development like API design:

Functions do one thing
Small functions compose into larger workflows
Every function has examples and mathematical background in the docs
Automated tests, not just manual checks
Vignettes that walk through complete analyses

The goal is that someone reading the vignette learns both the method and the implementation.
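For a sense of what that looks like at the level of a single function, here is a sketch using roxygen2-style comments and a testthat test. The function series_reliability is a hypothetical example, not taken from the packages, and the test only runs if testthat is installed.

```r
#' Reliability of a series system with exponential components
#'
#' A series system survives to time \eqn{t} only if every component does, so
#' \deqn{R(t) = \prod_j \exp(-\lambda_j t) = \exp(-t \sum_j \lambda_j).}
#'
#' @param t Non-negative numeric vector of times.
#' @param rates Positive numeric vector of component failure rates.
#' @return Numeric vector of survival probabilities, one per element of \code{t}.
#' @examples
#' series_reliability(t = c(0, 1, 5), rates = c(0.1, 0.2))
#' @export
series_reliability <- function(t, rates) {
  stopifnot(is.numeric(t), all(t >= 0), is.numeric(rates), all(rates > 0))
  exp(-t * sum(rates))
}

# An automated test: the closed form must match the product over components.
if (requireNamespace("testthat", quietly = TRUE)) {
  testthat::test_that("series reliability matches the component product", {
    t <- c(0, 1, 5)
    by_component <- exp(-0.1 * t) * exp(-0.2 * t)
    testthat::expect_equal(series_reliability(t, c(0.1, 0.2)), by_component)
  })
}
```

The function does one thing, the documentation carries the math, and the test checks the implementation against the formula it claims to implement.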

Why Open Source

Statistical methods that aren’t implemented might as well not exist. If I publish a paper claiming a method works, the code should be available for anyone to check. Publishing clean, documented R packages means my results are reproducible, my methods are auditable, and other people can build on them without starting from scratch.

These packages will eventually form part of my thesis. Building the tools while learning the theory forces me to understand both more deeply than I would by studying either alone.
