I have been working on the same statistical problem since 2020. I am now a PhD student in CS. The problem has not changed, but my understanding of it has, and the tools I have built around it look nothing like what I started with.
The problem: a series system fails when any component fails. You observe system-level failure times. But you often cannot tell which component caused the failure (masking). Some systems are still running when testing ends (censoring). Given this incomplete data, estimate component reliability.
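To make the setup concrete, here is a minimal base-R sketch of that data-generating process. Everything here is invented for illustration (exponential components, a toy C1-style masking scheme, arbitrary rates); it is not code from any of the packages discussed below.

```r
# Simulate a 3-component series system with right censoring and masking.
# Rates, sample size, and masking scheme are illustrative only.
set.seed(42)
m <- 3                                   # components in the series system
lambda <- c(0.8, 1.0, 1.2)               # exponential component rates
n <- 5; tau <- 1.0                       # sample size, end-of-test time

comp <- sapply(lambda, function(l) rexp(n, l))  # latent component lifetimes
t_raw <- apply(comp, 1, min)             # system fails at the first component failure
t_sys <- pmin(t_raw, tau)                # observed time, right-censored at tau
delta <- t_raw <= tau                    # TRUE = failure observed, FALSE = censored
cause <- apply(comp, 1, which.min)       # true failed component (latent in practice)

# Masking: when a failure is observed, report a candidate set that always
# contains the true cause (condition C1) plus one other component at random.
cands <- lapply(seq_len(n), function(i) {
  if (!delta[i]) return(integer(0))      # censored: no candidate set
  sort(c(cause[i], sample(setdiff(1:m, cause[i]), 1)))
})
```

The estimation problem is then: recover `lambda` from `t_sys`, `delta`, and `cands` alone, without ever seeing `cause`.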
This is not a tutorial. It is a map of where things stand and where they are going.
How It Started
I picked this for my math master’s thesis because it combined everything I wanted to learn: survival analysis, mixture models, the EM algorithm, bootstrap inference, simulation studies. I focused on Weibull components with right censoring, but the likelihood derivation was distribution-agnostic. The general framework was there; I just did not treat it as the main contribution.
The implementation was one monolithic R package (wei.series.md.c1.c2.c3). It worked. It was not reusable. The distribution algebra, the MLE infrastructure, the likelihood model, the series system logic, the masking conditions: everything tangled together. Want to change the distribution family? Rewrite half the package.
I defended in October 2023. The thesis was fine. The code was a dead end.
Pulling It Apart
After the defense, I started asking: what are the actual algebraic structures here?
It took a while, but the answer was several independent layers:
- Distributions form an algebra. Add, scale, compose. That became algebraic.dist.
- MLEs form an algebra. Delta method, bootstrap, reparameterization. That became algebraic.mle.
- Likelihood models are composable. The log-likelihood for heterogeneous data is a sum of contributions, one per observation type. That became likelihood.model.
- MLE solvers are composable. Chain them, race them, add restarts. The SICP closure property: composing solvers gives you a solver. That became compositional.mle.
- Hazard functions define distributions. Write \(h(t; \theta)\), get \(S(t)\), \(F(t)\), \(f(t)\), quantiles, sampling, and MLE for free. That became flexhaz. A companion, serieshaz, composes component hazards into series system distributions.
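The "hazard in, everything out" idea is compact enough to sketch in a few lines of base R. This is a toy illustration of the concept, not flexhaz's API; the example hazard is made up.

```r
# From a hazard h(t), derive survival, density, and quantiles numerically.
haz <- function(t) 1.5 * sqrt(t)        # an increasing, Weibull-type hazard

cumhaz <- function(t) integrate(haz, 0, t)$value   # H(t) = \int_0^t h(s) ds
S <- function(t) exp(-cumhaz(t))        # survival:  S(t) = exp(-H(t))
f <- function(t) haz(t) * S(t)          # density:   f(t) = h(t) S(t)
q <- function(p)                        # quantile:  solve S(t) = 1 - p
  uniroot(function(t) S(t) - (1 - p), c(1e-8, 100))$value

# Check against the closed form: h(t) = 1.5 sqrt(t) gives H(t) = t^1.5,
# so S(1) = exp(-1).
S(1)
```

Sampling comes along for free too, e.g. by inversion: `q(runif(1))`. The real package has to work harder than this (vectorization, caching, numerical stability near the tails), but the mathematical content is exactly this chain.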
The dependency graph:
```
algebraic.dist
   |        \
algebraic.mle \
   |       \   \
compositional.mle   likelihood.model
                      |          \
                   flexhaz    maskedcauses
                      |
                  serieshaz
                      |
                  maskedhaz
```
None of this was planned. It emerged from repeatedly asking “why is this code so hard to change?” until the natural joints appeared.
AD and nabla
Preparing the master’s defense (which included BCa bootstrap confidence intervals) exposed the places where my understanding was mechanical rather than structural. I knew the formulas. I did not always know why they were the right formulas.
After the defense, I experimented with PyTorch’s autograd for MLE. I was already familiar with the relationship between gradients and score functions, but running it in a computational graph solidified it. The score function is the gradient of the log-likelihood. The observed information is the negative Hessian. Same computation, different names.
That led me to build nabla, which I think is the most general AD package in R. Nested dual numbers, S4 dispatch. D(f) gives gradients. D(D(f)) gives Hessians. Works through loops and branches. Arbitrary order, machine precision.
It is too slow for optimization. That is not the point. You find your MLE however you want (analytical, optim, whatever), then use nabla to characterize the solution: observed Fisher information, skewness analysis, higher-order diagnostics. The right tool for final analysis, not for search.
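The mechanics behind forward-mode AD are small enough to show in miniature. This is a toy dual-number sketch of the idea, not nabla's S4 implementation; I name the operator `D1` to avoid clashing with `stats::D`.

```r
# Minimal forward-mode dual numbers in base R (toy sketch, not nabla's API).
# A dual carries a value and a derivative; arithmetic propagates both.
dual <- function(val, der = 0) structure(list(val = val, der = der), class = "dual")

Ops.dual <- function(e1, e2) {
  if (!inherits(e1, "dual")) e1 <- dual(e1)
  if (!inherits(e2, "dual")) e2 <- dual(e2)
  switch(.Generic,
    "+" = dual(e1$val + e2$val, e1$der + e2$der),
    "-" = dual(e1$val - e2$val, e1$der - e2$der),
    "*" = dual(e1$val * e2$val,                      # product rule
               e1$der * e2$val + e1$val * e2$der),
    "/" = dual(e1$val / e2$val,                      # quotient rule
               (e1$der * e2$val - e1$val * e2$der) / e2$val^2),
    stop("unsupported op: ", .Generic))
}

# D1(f) returns the derivative function of f: seed der = 1, read off $der.
D1 <- function(f) function(x) f(dual(x, 1))$der

f <- function(x) x * x + 3 * x   # f'(x) = 2x + 3, f''(x) = 2
D1(f)(2)        # 7
D1(D1(f))(2)    # 2: nesting duals gives higher-order derivatives
```

The nesting in the last line is the key trick: because the arithmetic dispatches on duals whose slots are themselves duals, `D1(D1(f))` works without any extra machinery, which is the same closure property that makes arbitrary-order differentiation possible.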
The Foundation Paper
In 2025, I wrote the paper that should have been separate from the start.
The thesis had the distribution-agnostic likelihood derivation buried in a document mostly about Weibull. The foundation paper extracts that general framework and gives it a proper treatment. It handles all four observation types (exact, right-censored, left-censored, interval-censored) and defines three conditions on the masking mechanism:
- C1: The candidate set always contains the true failed component.
- C2: Masking probabilities are symmetric across components.
- C3: Masking probabilities do not depend on the system parameters.
Under C1-C2-C3, the likelihood factors cleanly. The paper derives the general likelihood, score equations, and Fisher information, then instantiates them for exponential, Weibull, Pareto, log-normal, and gamma families.
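To make "factors cleanly" concrete: in the standard masked-data setting (modulo the paper's own notation, which may differ), the contribution of an exact failure time \(t_i\) observed with candidate set \(\mathcal{C}_i\) reduces to

\[
L_i(\theta) \;\propto\; \Bigl[\, \sum_{j \in \mathcal{C}_i} h_j(t_i; \theta_j) \Bigr] \prod_{k=1}^{m} S_k(t_i; \theta_k),
\]

where \(h_j\) and \(S_j\) are the component hazard and survival functions. Under C1-C2-C3 the masking probability \(\Pr(\mathcal{C}_i \mid \text{cause})\) is a constant that does not depend on \(\theta\), so it drops out of the score, and the likelihood depends on the masking mechanism only through which candidate sets were reported.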
The exponential companion pushes the simplest case as far as the math goes. Closed-form MLE. Analytical Fisher information. A proof that information loss from masking is monotone (more masking, less information, strictly). And a result I did not expect: uniform masking maximizes the entropy of the candidate set given the failure cause. It is the worst case for identifiability among all C2-compliant masking models. The closed-form results are not just a convenient special case. They are pessimistic bounds.
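Two classical competing-risks facts do much of the work in the exponential case, and both are easy to verify by simulation: a series system of independent exponential components with rates \(\lambda_j\) has an Exponential\((\sum_j \lambda_j)\) lifetime, and component \(j\) is the failure cause with probability \(\lambda_j / \sum_k \lambda_k\). A base-R check (the rates are arbitrary):

```r
# Monte Carlo check of the exponential series-system facts.
set.seed(1)
lambda <- c(0.5, 1.0, 1.5)                    # sum(lambda) = 3
n <- 2e5
comp <- sapply(lambda, function(l) rexp(n, l))
sys_time <- apply(comp, 1, min)               # system lifetime
cause    <- apply(comp, 1, which.min)         # failing component

mean(sys_time)       # close to 1 / sum(lambda) = 1/3
table(cause) / n     # close to lambda / sum(lambda) = (1/6, 1/3, 1/2)
```

These two facts are what make the closed-form MLE and the analytical Fisher information tractable in the first place; once time-varying hazards enter (Weibull), neither holds in this simple form.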
Where Things Are
Software
Six packages on CRAN, five more in progress:
| Package | What It Does | Status |
|---|---|---|
| algebraic.dist | Distribution algebra | On CRAN |
| algebraic.mle | MLE algebra | On CRAN |
| likelihood.model | Fisherian likelihood framework | On CRAN |
| compositional.mle | Composable MLE solvers | On CRAN |
| hypothesize | Hypothesis testing framework | On CRAN |
| nabla | Automatic differentiation (dual numbers) | On CRAN |
| flexhaz | Distributions from hazard functions | Targeting CRAN + JOSS |
| serieshaz | Series system distributions | Targeting CRAN + JOSS |
| maskedcauses | Analytical MLE for masked series | Targeting CRAN + JOSS |
| maskedhaz | Numerical MLE for masked series | Targeting CRAN + JOSS |
| mdrelax | Relaxed masking conditions | Paper first; package if it pans out |
maskedcauses and maskedhaz solve the same problem at different levels of generality. maskedcauses has closed-form solutions for exponential and Weibull. maskedhaz handles arbitrary hazard functions via numerical integration. When both are installed, the test suites cross-validate.
mdrelax explores relaxing the C1, C2, and C3 conditions: informative masking, parameter-dependent masking, masking probabilities less than one. Right now it is research code for a paper. If the results hold up, a proper package may come out of it.
Papers
| Paper | Status |
|---|---|
| Foundation (C1-C2-C3 framework) | Draft complete |
| Exponential companion (closed-form FIM) | Draft complete |
| Model selection (LRT nesting chain) | Draft complete, software in maskedcauses vignette |
| Relaxed C1/C2/C3 conditions | Draft in progress |
| Master’s thesis (original Weibull treatment) | Published |
What I Am Working On Next
I have five companion paper directions. Some are real; some are barely more than a paragraph.
Identifiability (active)
The foundation paper proves a necessary and sufficient condition for identifiability from masked data. But it is a yes/no result. It says nothing about how much diagnostic separation you need in practice, which candidate set structures help most, or how identifiability degrades as masking gets heavier.
I want to give a graph-theoretic characterization (confounding graph, connected components as “super-components”), a linear algebra condition for the exponential case (rank of the candidate-set matrix), and simulation ablation studies.
The exponential companion connects directly: uniform masking with candidate sets of size \(w = m-1\) is the most pessimistic identifiable scenario. So the question inverts: what masking design minimizes information loss? That is a D-optimality problem for the masking mechanism. I do not know the answer yet.
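My reading of the rank condition can be illustrated with a toy example (this is a sketch of the idea, not the paper's construction): in the exponential case, each distinct candidate set informs you about the sum of its members' rates, so stack the sets as 0/1 indicator rows and check whether the resulting matrix has full column rank \(m\).

```r
# Toy rank check on candidate-set structures over m = 3 components.
# Rows are distinct candidate sets, encoded as 0/1 membership indicators.
m <- 3
sets_ok  <- rbind(c(1, 1, 0),   # {1,2}
                  c(0, 1, 1),   # {2,3}
                  c(1, 0, 0))   # {1}
sets_bad <- rbind(c(1, 1, 0),   # {1,2}
                  c(1, 1, 0),   # {1,2} again: 1 and 2 never separated
                  c(1, 1, 1))   # {1,2,3}

qr(sets_ok)$rank   # 3: full rank, rates separable
qr(sets_bad)$rank  # 2: rank deficient, components 1 and 2 confounded
```

In the rank-deficient case, only \(\lambda_1 + \lambda_2\) is pinned down, which is exactly the "super-component" picture: the confounding graph's connected component \(\{1, 2\}\) behaves as a single unit.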
Observation scheme composition (idea)
maskedcauses implements composable observation functors: observe_right_censor(), observe_left_censor(), observe_periodic(), observe_mixture(). The mathematical content is that C1-C2-C3 are preserved under these compositions. If your masking mechanism satisfies the conditions, and you compose it with an independent censoring scheme, the composition still satisfies them.
The proofs are straightforward. The interesting question is whether there is a category-theoretic formulation. Observation schemes as morphisms, composition as functor composition. If so, the closure theorems become instances of a general principle rather than case-by-case arguments. I do not know if this leads anywhere useful or is just abstraction for its own sake.
Parsimony vs physical structure (idea)
If the data are heavily masked and the sample is small, a single Weibull fits the system lifetime data about as well as an \(m\)-component series model. Standard model selection says: use the simpler model. Engineering knowledge says: the components exist.
This is a real tension. If you want to predict system lifetimes, maybe the single Weibull is fine. If you want to estimate component reliability for maintenance planning, the series decomposition matters even if it is over-parameterized relative to the data. I think this is a short paper. Maybe just an essay with simulations.
Weibull companion (planned)
The exponential companion has closed-form results because exponential lifetimes are memoryless. Weibull has shape parameters, which means time-varying hazards and numerical integration for some censoring types.
The thesis covers Weibull estimation. The companion paper would redo it properly in the foundation paper’s notation: score equations, Fisher information, simulation studies, guidance for choosing homogeneous vs heterogeneous models. The theory exists in pieces across the thesis and maskedcauses. Writing it up as a standalone paper is substantial work.
CRAN and JOSS submissions
The immediate practical work. flexhaz, serieshaz, maskedcauses, and maskedhaz are all targeting both CRAN and JOSS. I have been through the CRAN gauntlet six times. It is tedious but not mysterious.
What I Have Learned
Three things.
Decompose first. The monolithic thesis code worked but could not grow. Pulling it into algebraic layers took months. Every piece became independently testable, reusable, publishable. The decomposition is itself a research contribution, not just engineering convenience.
Publish the general theory separately. The thesis had the distribution-agnostic framework buried in a Weibull-specific document. Writing the foundation paper forced me to separate what is structural (C1-C2-C3 factorization, observation type taxonomy) from what is distribution-specific. That separation made the companion papers possible.
The Fisher information matrix is the right lens. For two years I treated FIM as “the thing you invert to get standard errors.” The exponential companion forced me to see it as a measure of how much information the observation scheme actually provides. Once I saw that, the identifiability results, the information loss monotonicity, and the optimal design questions all fell out naturally. FIM connects the statistical theory to the practical question of how to design better diagnostics.
The work is not close to done. But the pieces are in place: a clean theoretical framework, a modular software stack, two completed papers, a clear map of what comes next. Most days, that is enough to keep going.