Masked Failure Data: Looking Back, Looking Forward
A retrospective on three years of building R packages and writing papers for masked series system reliability, and what comes next.
Browse posts by tag
Observation functors in maskedcauses: composable functions that separate the data-generating process from the observation mechanism, enabling mixed-censoring simulation and verified Monte Carlo studies.
The maskedcauses R package for MLE in series systems with masked component failures, built on composable likelihood contributions and validated through simulation.
Applied Bayesian inference with computing methods. The standard Bayesian statistics reference.
An R package where optimization solvers are first-class functions that compose through chaining, racing, and restarts.
Rigorous graduate-level probability and statistics; useful for inference and ML foundations.
Standard rigorous text on estimation, hypothesis testing, and asymptotic theory.
My graduate coursework from SIUe's math program is up: time series, regression, computational stats, multivariate analysis, and statistical methods.
Define statistical models symbolically and automatically derive score functions, Hessians, and Fisher information. No numerical approximation.
My R package for hypothesis testing, hypothesize, is now available on CRAN.
Extending masked failure data analysis when the standard C1-C2-C3 masking conditions are violated.
When can reliability engineers safely use simpler models? Likelihood ratio tests on Weibull series systems give sharp boundaries.
Closed-form MLEs and Fisher information for exponential series systems with masked failure data. No numerical optimization required.
Maximum likelihood estimation of component reliability from masked failure data in series systems, with BCa bootstrap confidence intervals validated through extensive simulation studies.
A C++20 library for composing online statistical accumulators with numerically stable algorithms and algebraic composition.
I experiment with simple predictive / generative models to approximate Solomonoff induction for a relatively simple synthetic data-generating process.
I defended my mathematics thesis. Three years, stage 3 cancer, and a second master's degree. Here is what worked and what did not.
Graduate problem set solutions in computational statistics and numerical methods from my math master's at SIUe. Implementing things from scratch teaches you what the libraries are hiding.
In my paper, Reliability Estimation in Series Systems, I discarded a lot of research that may be interesting to pursue further. This one is about using homogeneous shape parameters for the Weibull series system, which can greatly simplify the …
Numerical approaches to maximum likelihood estimation, covering the optimization methods and computational issues that come up in practice.
A generic R framework for composable likelihood models. Likelihoods are first-class objects that compose through independent contributions.
Weibull distributions model time-to-failure in reliability engineering and cancer survival. I study both professionally. One of them became personal.
An R package that gives hypothesis tests a consistent interface. Every test returns the same structure. You can write generic code that works across all of them.
Problem sets for STAT 581 - Statistical Methods at SIUe, taught by Dr. Neath during Fall 2021.
An experiment is conducted to study the effect of fitness level on ego strength. Random samples of college faculty members are selected from each
A randomized complete block design is used to study the effect of caliper on the measured diameters
An experiment is designed to investigate whether the time to drill holes in rock differs between wet and dry drilling.
A product developer is investigating the tensile strength of a new synthetic fiber that will be used to make cloth for men’s shirts.
An experiment is conducted to study the effect of drilling method on drilling time. Each method (dry drilling, wet drilling) is used on $n = 12$ rocks.
An experiment to compare a new drug to a standard is in the planning stages. The response variable of interest is the clotting time (in minutes) of blood
The insulating life of protective fluids at an accelerated load is being studied. The experiment has been performed for four types of fluids, with $n = 5$
A factorial experiment is used to develop a nitride etch process on a single wafer plasma etching tool.
A soft drink bottler is interested in studying the effects on a filling process. A factorial experiment is run using three factors: percent carbonation (in %),
A paired comparisons design is used to study the effect of machine operator on the measured running time (in secs.) of a fuse. A sample of $n = 10$ fuses is
An experiment is designed to test for systematic differences in the hardness measurements provided by two devices (fixed effect, factor $A$).
The surface finish of metal parts made on $a=4$ machines is under investigation. Each machine can be run by one of $b=3$ operators.
This problem set covers the E-M algorithm for right-censored normal data with known variance.
A review of SAX (Symbolic Aggregate approXimation), a method for converting real-valued time series into symbolic representations with guaranteed distance lower bounds.
This problem set covers sampling from a Gamma distribution using Metropolis-Hastings and acceptance-rejection methods.
Bootstrap resampling trades mathematical complexity for computational burden. When you can't derive the variance analytically, you resample. For my thesis work on masked failure data, that trade is essential.
An R package for specifying hazard functions directly instead of picking from a catalog of named distributions. You write the hazard. It handles the rest.
This problem set covers multicollinearity in regression analysis and the marginal and partial effects of predictor variables, among other topics.
This is a problem set for STAT 482 - Regression Analysis at SIUe. These problem sets were given by Dr. Andrew Neath, a professor in the Department of Mathematics and Statistics at Southern Illinois University Edwardsville (SIUe) during the Fall 2022 …
This is a problem set for STAT 575 - Computational Statistics at SIUe. These problem sets were given by Dr. Qiang Beidi, a professor in the Department of Mathematics and Statistics at Southern Illinois University Edwardsville (SIUe) during the Summer …
An R package that treats MLEs as algebraic objects. They carry Fisher information, compose through independent likelihoods, and propagate uncertainty correctly.
Discrete multivariate analysis exam covering log-linear models and categorical data analysis.
Final exam for discrete multivariate analysis.
Problem set 10 for discrete multivariate analysis.
Problem set 5 for discrete multivariate analysis.
Problem set 6 for discrete multivariate analysis.
Problem set 7 for discrete multivariate analysis.
Problem set 8 for discrete multivariate analysis.
Problem set 9 for discrete multivariate analysis.
Problem sets for STAT 478 - Time Series Analysis at SIUe, taught by Dr. Beidi during Spring 2021.
Problem sets for STAT 579 - Discrete Multivariate Analysis at SIUe, taught by Dr. Andrew Neath during Spring 2021.
Time series analysis exam covering ARMA processes and model identification.
Time series analysis coursework.
Final exam for time series analysis course.
Problem set 3 for time series analysis.
Problem set 4 for time series analysis.
Problem set 5 for time series analysis.
Problem set 6 for time series analysis.
Time series analysis project.
An R package that treats probability distributions as algebraic objects. They compose through standard operations. The algebra preserves distributional structure.
I'm building R packages for reliability analysis, not just using other people's. R's strengths for statistical computing are real, and building packages forces you to understand the theory.
I already have an MS in Computer Science. Now I'm going back for Mathematics and Statistics, because I kept hitting walls where I could use methods but not derive them.
Introduction to reliability analysis with censored data, where observations are incomplete but statistically informative.