algebraic.dist: Distributions as Algebraic Objects in R
An R package that treats probability distributions as algebraic objects. They compose through standard operations. The algebra preserves distributional structure.
Maximum likelihood estimation for series systems with masked failure data, from master's thesis to R package ecosystem
A series system fails when any component fails. You observe system-level failure times, but you often cannot tell which component caused the failure (masking), and some systems are still running when testing ends (censoring). The problem: given this incomplete data, estimate the reliability of the individual components.
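To make the shape of the data concrete, here is a minimal base-R sketch that simulates it. Everything specific in it is an assumption made for illustration, not the thesis code or any package API: the helper name `simulate_masked_series`, the exponential component lifetimes, the rates, the censoring time `tau`, and the masking rule (each non-failed component joins the candidate set independently with probability `p_mask`).

```r
# Hypothetical helper: simulate masked, right-censored series system data.
# Exponential lifetimes and the masking rule are illustrative assumptions.
simulate_masked_series <- function(n, rates, tau, p_mask = 0.3) {
  m <- length(rates)
  # n x m matrix of latent component lifetimes; column j uses rates[j]
  lifetimes <- matrix(rexp(n * m, rate = rep(rates, each = n)), nrow = n)
  system_time <- apply(lifetimes, 1, min)       # series system: first failure
  cause <- apply(lifetimes, 1, which.min)       # latent failed component
  failed <- system_time <= tau                  # FALSE means right-censored
  # Candidate sets as an n x m logical matrix
  candidates <- matrix(runif(n * m) < p_mask, nrow = n)
  candidates[cbind(seq_len(n), cause)] <- TRUE  # true cause is always included
  candidates[!failed, ] <- FALSE                # censored units have no candidate set
  list(
    time = pmin(system_time, tau),              # what the analyst observes
    failed = failed,
    candidates = candidates,
    cause = cause                               # latent; kept only for checking
  )
}

set.seed(1)
dat <- simulate_masked_series(n = 200, rates = c(1, 2, 3), tau = 0.4)
head(data.frame(time = dat$time, failed = dat$failed, dat$candidates))
```

In a real data set only `time`, `failed`, and `candidates` would be available; `cause` is exactly the information that masking hides.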
This is the problem at the center of my math master’s thesis and the ecosystem of R packages and papers that grew out of it.
The likelihood combines survival analysis (censored observations), mixture models (unknown failure cause), and constrained optimization (parameter estimation from incomplete data). Under three conditions on the masking mechanism, it factors into a clean form that works for any lifetime distribution family: C1, the candidate set always contains the true cause; C2, the probability of reporting a given candidate set is the same no matter which component in that set actually failed; C3, the masking probabilities do not depend on the lifetime parameters.
I started this in 2020 during my math master's program. The thesis focused on Weibull components with right censoring. The likelihood derivation was distribution-agnostic, but I did not treat the general framework as its own contribution. That came later.
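Written out, the factored likelihood takes the following standard form for masked series data. The notation is assumed here for illustration: system i has observed time t_i, indicator δ_i (1 if a failure was observed, 0 if right-censored), and candidate set C_i; component j has survival function R_j and hazard function h_j with parameters θ_j.

```latex
L(\theta) \;\propto\; \prod_{i=1}^{n}
  \left[ \prod_{j=1}^{m} R_j(t_i;\theta_j) \right]
  \left[ \sum_{k \in C_i} h_k(t_i;\theta_k) \right]^{\delta_i}
```

The first factor says every component survived up to t_i. The second says that, for an observed failure, some component in the candidate set failed at t_i. The masking probabilities themselves drop into the proportionality constant, because under C2 they are the same for every component in C_i and under C3 they do not involve θ.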
After defending in 2023, I decomposed the monolithic thesis code into a layered R package ecosystem. Six packages are on CRAN. Four more are targeting CRAN and JOSS. The foundation paper extracts the general C1-C2-C3 framework and gives it a proper treatment. The exponential companion pushes the simplest case to closed-form MLE and analytical Fisher information.
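As a rough illustration of the exponential case, the sketch below fits component rates by maximizing the C1-C2-C3 log-likelihood numerically with base R's `optim`. It does not reproduce the companion paper's closed-form estimator or any package API; the function name `loglik_exp_masked`, the simulated data, and the masking rule are assumptions for illustration. The one closed-form fact it checks is easy to derive: for exponential components the estimated total rate equals the number of observed failures divided by the total time on test.

```r
# Log-likelihood for exponential components under C1-C2-C3 (illustrative).
# time: observed times; failed: TRUE if a failure was observed;
# candidates: n x m logical matrix of candidate sets.
loglik_exp_masked <- function(rates, time, failed, candidates) {
  # Every unit survives to its observed time: exp(-sum(rates) * t_i)
  survival_part <- -sum(rates) * sum(time)
  # Observed failures contribute the summed hazard over the candidate set
  cand <- candidates[failed, , drop = FALSE] + 0   # logical -> numeric 0/1
  set_hazard <- as.vector(cand %*% rates)
  survival_part + sum(log(set_hazard))
}

# Simulated data (assumed rates, censoring time, and masking probability)
set.seed(42)
n <- 500; true_rates <- c(1, 2, 3); tau <- 0.4; m <- length(true_rates)
lifetimes <- matrix(rexp(n * m, rate = rep(true_rates, each = n)), nrow = n)
system_time <- apply(lifetimes, 1, min)
cause <- apply(lifetimes, 1, which.min)
failed <- system_time <= tau
time <- pmin(system_time, tau)
candidates <- matrix(runif(n * m) < 0.3, nrow = n)
candidates[cbind(seq_len(n), cause)] <- TRUE       # C1: true cause included

# Optimize over log-rates so the rates stay positive
fit <- optim(
  par = rep(0, m),
  fn = function(log_rates) {
    -loglik_exp_masked(exp(log_rates), time, failed, candidates)
  },
  method = "BFGS"
)
rate_hat <- exp(fit$par)

# Closed-form check on the total rate: failures / total time on test
total_rate_closed_form <- sum(failed) / sum(time)
c(numerical = sum(rate_hat), closed_form = total_rate_closed_form)
```

Swapping the exponential survival and hazard terms for another family's changes only `loglik_exp_masked`; the optimization wrapper stays the same, which is the practical payoff of a distribution-agnostic likelihood.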
algebraic.dist
      |      \
algebraic.mle  \
      |      \   \
compositional.mle  likelihood.model
                      |        \
                   flexhaz   maskedcauses
                      |
                  serieshaz
                      |
                  maskedhaz
Each layer solves one problem. algebraic.dist treats distributions as algebraic objects. algebraic.mle provides algebra for MLEs. likelihood.model composes likelihood contributions. flexhaz defines distributions from hazard functions. maskedcauses and maskedhaz handle masked series system data at different levels of generality (closed-form vs numerical).
Posts cover the thesis itself, the algebraic decomposition of the code, individual package announcements, the foundation and companion papers, model selection for the Weibull nesting chain, relaxed masking conditions, and a retrospective on the whole arc.
[Archived] Master's project (SIUE, 2023): MLE for series system reliability with Weibull components under right-censoring and masked failure data. See likelihood.model.series.md for active software.
An R package where optimization solvers are first-class functions that compose through chaining, racing, and restarts.
The maskedcauses R package for MLE in series systems with masked component failures, built on composable likelihood contributions and validated through simulation.
Observation functors in maskedcauses: composable functions that separate the data-generating process from the observation mechanism, enabling mixed-censoring simulation and verified Monte Carlo studies.
A retrospective on three years of building R packages and writing papers for masked series system reliability, and what comes next.
My master's project on maximum likelihood estimation for series systems with right-censored and masked failure data.
My R package for hypothesis testing, hypothesize, is now available on CRAN.
Extending masked failure data analysis when the standard C1-C2-C3 masking conditions are violated.
When can reliability engineers safely use simpler models? Likelihood ratio tests on Weibull series systems give sharp boundaries.
Closed-form MLEs and Fisher information for exponential series systems with masked failure data. No numerical optimization required.
Maximum likelihood estimation of component reliability from masked failure data in series systems, with BCa bootstrap confidence intervals validated through extensive simulation studies.
I defended my mathematics thesis. Three years, stage 3 cancer, and a second master's degree. Here is what worked and what did not.
Numerical approaches to maximum likelihood estimation, covering the optimization methods and computational issues that come up in practice.
A generic R framework for composable likelihood models. Likelihoods are first-class objects that compose through independent contributions.
Weibull distributions model time-to-failure in reliability engineering and cancer survival. I study both professionally. One of them became personal.
An R package that gives hypothesis tests a consistent interface. Every test returns the same structure. You can write generic code that works across all of them.
Bootstrap resampling trades mathematical complexity for computational burden. When you can't derive the variance analytically, you resample. For my thesis work on masked failure data, that trade is essential.
An R package for specifying hazard functions directly instead of picking from a catalog of named distributions. You write the hazard. It handles the rest.
An R package that treats MLEs as algebraic objects. They carry Fisher information, compose through independent likelihoods, and propagate uncertainty correctly.
I'm building R packages for reliability analysis, not just using other people's. R's strengths for statistical computing are real, and building packages forces you to understand the theory.
I already have an MS in Computer Science. Now I'm going back for Mathematics and Statistics, because I kept hitting walls where I could use methods but not derive them.
Introduction to reliability analysis with censored data, where observations are incomplete but statistically informative.