In a series system, the system fails when any component fails. You observe the system failure time \(t\) and a candidate set \(C \subseteq \lbrace 1,2,\ldots,m\rbrace\) of components that might have caused the failure. But you do not know which component in \(C\) actually failed. This is masked failure data.
The standard approach is numerical optimization of the likelihood. This paper shows that for exponential component lifetimes, everything has a closed form.
Closed-Form Fisher Information
For exponential masked data with arbitrary masking patterns:
$$I_{ij}(\boldsymbol{\lambda}) = n \cdot \sum_{A \ni i,j} \frac{\hat{\omega}_A}{(\sum_{k \in A} \lambda_k)^2}$$

where \(\hat{\omega}_A\) is the observed frequency of candidate set \(A\). You can compute asymptotic variances directly, check identifiability before running any estimation, and analyze optimization stability. All without fitting a model first.
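To show how direct the computation is, here is a minimal Python sketch of the plug-in information matrix. The function name and the frozenset representation of candidate sets are illustrative choices, not the API of any package mentioned in this post:

```python
import numpy as np

def fisher_information(rates, freq):
    """Plug-in Fisher information (per observation) for exponential masked data.

    rates : list of component failure rates lambda_1..lambda_m (0-based here).
    freq  : dict mapping candidate sets (frozensets of component indices)
            to their observed frequencies omega_hat_A (summing to 1).
    Multiply the result by n to get the full-sample information matrix.
    """
    m = len(rates)
    info = np.zeros((m, m))
    for A, w in freq.items():
        s = sum(rates[k] for k in A)   # sum of rates over the candidate set
        for i in A:                    # entry (i, j) accumulates only over
            for j in A:                # sets A containing both i and j
                info[i, j] += w / s**2
    return info
```

A full-rank result (`np.linalg.matrix_rank(info) == m`) is exactly the identifiability check the closed form makes possible before any fitting.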
Sufficient Statistics
The mean system lifetime and the candidate set frequency vector are sufficient statistics. That reduces an entire dataset to \(1 + \binom{m}{w}\) numbers, where \(w\) is the masking width.
This is a real simplification. All the statistical information in your data is captured by two things: how often each candidate set appears, and what the average failure time is. Nothing else matters for inference.
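The reduction is one pass over the raw data. A minimal sketch (names are illustrative):

```python
from collections import Counter

def sufficient_stats(times, candidate_sets):
    """Reduce raw masked data to its sufficient statistics:
    the mean system lifetime t_bar and the candidate-set frequency vector."""
    n = len(times)
    t_bar = sum(times) / n
    counts = Counter(frozenset(C) for C in candidate_sets)
    freq = {A: c / n for A, c in counts.items()}
    return t_bar, freq
```

Everything downstream — the MLE, the Fisher information, the confidence intervals — consumes only `t_bar` and `freq`.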
Closed-Form MLE for Three Components
For \(m=3\) components with pairwise masking (\(w=2\)), the MLE has an explicit closed-form solution:
$$\hat{\lambda}_j = \frac{1}{\bar{t}} \left( \sum_{A \ni j} \hat{\omega}_A - \sum_{A \not\ni j} \hat{\omega}_A \right)$$

where, for \(m=3\) and \(w=2\), exactly one candidate set does not contain \(j\). As a sanity check, the estimates sum to \(1/\bar{t}\), the MLE of the total system rate. No numerical optimization. No iterative algorithms. Just plug in your sufficient statistics.
The \(w=2\) case is the interesting one. \(w=1\) means no masking (you know exactly which component failed). \(w=m\) means complete masking (the candidate set is always everything, so you have no diagnostic information). \(w=2\) is the simplest case where masking actually matters, and it is the one where closed-form solutions exist.
Asymptotic Theory
The MLE follows:
$$\sqrt{n}(\hat{\boldsymbol{\lambda}}_n - \boldsymbol{\lambda}^\star) \xrightarrow{d} \mathcal{N}(\mathbf{0}, \mathcal{I}^{-1}(\boldsymbol{\lambda}^\star))$$

with explicit Wald-type confidence intervals using the closed-form Fisher information. So you get point estimates and uncertainty quantification, all analytically.
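Combining the closed-form information with the MLE gives interval estimates in a few lines. A sketch, with the 1.96 quantile hardcoding 95% coverage and the function name illustrative:

```python
import numpy as np

def wald_intervals(rates_hat, freq, n):
    """95% Wald intervals for component rates, using the plug-in
    Fisher information evaluated at the MLE (0-based component indices)."""
    m = len(rates_hat)
    info = np.zeros((m, m))
    for A, w in freq.items():
        s = sum(rates_hat[k] for k in A)
        for i in A:
            for j in A:
                info[i, j] += w / s**2
    cov = np.linalg.inv(n * info)   # asymptotic covariance of the MLE
    se = np.sqrt(np.diag(cov))      # standard errors from the diagonal
    z = 1.96                        # standard normal quantile for 95% coverage
    return [(rates_hat[j] - z * se[j], rates_hat[j] + z * se[j])
            for j in range(m)]
```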
Why Exponential?
The exponential assumption is not just for tractability, though that helps. A constant hazard rate models systems subject to random external shocks, and the memoryless property simplifies the likelihood structure. The exponential case is also the foundation for generalizations to Weibull and other distributions.
More practically, exponential is the one case where closed-form solutions exist. The insights from the exponential case guide numerical methods for more complex lifetime distributions.
Connections
This paper is the analytical foundation for the generalized Weibull treatment in my master’s thesis, which handles shape and scale parameters, more realistic masking models, and model selection between candidate failure mechanisms.
The R packages likelihood.model and maskedcauses implement these methods.
Links
- Paper: Statistical Inference for Series Systems from Masked Failure Time Data: The Exponential Case
- Repository: github.com/queelius/expo-masked-fim
- Key result: Complete analytical treatment including MLE, Fisher information, sufficient statistics, and asymptotic theory