Statistics
Browse posts by tag
Bayesian Data Analysis
Notes
Applied Bayesian inference with computing methods. The standard Bayesian statistics reference.
compositional.mle: SICP-Inspired Optimization
An R package where solvers are first-class functions that compose through chaining, racing, and restarts.
Introduction to Probability and Mathematical Statistics
Notes
Rigorous graduate-level probability and statistics; useful for inference and ML foundations.
Statistical Inference
Notes
Standard rigorous text on estimation, hypothesis testing, and asymptotic theory.
The Elements of Statistical Learning
Graduate Statistics Problem Sets Now Available
I've made my graduate coursework from SIUe's mathematics program available online, covering time series, regression, computational statistics, multivariate analysis, and statistical methods.
symlik: Symbolic Likelihood Models in Python
Introducing symlik - define statistical models symbolically and automatically derive score functions, Hessians, and Fisher information.
hypothesize: Now Available on CRAN
My R package for hypothesis testing, hypothesize, is now available on CRAN.
Likelihood Models for Series Systems with Masked Component Failure Data: An R Package for Maximum Likelihood Estimation
mdrelax: When Masking Conditions Don't Hold
Extending masked failure data analysis when traditional C1-C2-C3 conditions are violated.
Model Selection for Weibull Series Systems: When Simpler Models Suffice
When can reliability engineers safely use simpler models? This paper provides sharp boundaries through likelihood ratio tests on Weibull series systems.
Closed-Form Results for Masked Exponential Series Systems
This paper provides complete analytical results for maximum likelihood estimation in series systems with masked failure data under exponential component lifetimes. Unlike numerical approaches, everything here has a closed form.
The Problem
In series …
Statistical Inference for Series Systems from Masked Failure Time Data: The Exponential Case
Alea: A Modern C++ Library for Algebraic Random Elements
Fisher Flow: An Information-Geometric Framework for Sequential Estimation
Reliability Estimation in Series Systems: Maximum Likelihood Techniques for Right-Censored and Masked Failure Data
Maximum likelihood estimation of component reliability from masked failure data in series systems, with BCa bootstrap confidence intervals validated through extensive simulation studies.
Accumux: Compositional Online Statistical Reductions in C++
A modern C++20 library for compositional online data reductions with numerically stable algorithms and algebraic composition.
Approximations of Solomonoff Induction
I experiment with simple predictive / generative models to approximate Solomonoff induction for a relatively simple synthetic data-generating process.
Mathematics Master's Complete: Post-Mortem on Three Years
I defended my mathematics thesis yesterday. It’s done.
Three years. Two degrees. Stage 3 cancer. And now: MS in Mathematics and Statistics from SIUE.
October 13, 2023: Defense complete.
Time for a post-mortem on what worked, what didn’t, …
Model Selection for Reliability Estimation in Series Systems
Reliability Estimation in Series Systems: Maximum Likelihood Techniques for Right-Censored and Masked Failure Data
Problem Set Solutions
I have a fairly broad interest in problem-solving, from problems in statistics to algorithms. Over the years, I’ve accumulated a collection of problem sets from graduate coursework and independent study. These represent solutions to challenging …
Model Selection in Weibull Series Systems
In my paper, Reliability Estimation in Series Systems, I discarded a lot of research that may be interesting to pursue further. This one is about using homogeneous shape parameters for the Weibull series system, which can greatly simplify the …
Numerical Methods for Maximum Likelihood Estimation
Numerical approaches to solving maximum likelihood estimation problems.
likelihood.model: Composable Statistical Inference in R
Most R packages hardcode specific likelihood models. likelihood.model provides a generic framework where likelihoods are first-class composable objects—designed to work seamlessly with algebraic.mle for maximum likelihood estimation.
The Core Concept …
Weibull Distributions: The Mathematics of Failure and Survival
The Weibull distribution models time-to-failure. In reliability engineering, that’s component lifetimes. In medicine, it’s survival times.
I’ve been studying Weibull distributions for my thesis on series system reliability. Then I …
hypothesize: A Consistent API for Statistical Tests in R
R’s hypothesis testing functions are inconsistent—t.test() returns different structures than chisq.test(), making generic workflows painful. hypothesize provides a unified API so any test returns the same interface: p-value, test statistic, …
STAT 581 - Statistical Methods - SIUe - Fall 2021
Problem sets for STAT 581 - Statistical Methods at SIUe, taught by Dr. Neath during Fall 2021.
Statistical Methods - STAT 581 - Exam 1
An experiment is conducted to study the effect of fitness level on ego strength. Random samples of college faculty members are selected from each
Statistical Methods - STAT 581 - Exam 2
A randomized complete block design is used to study the effect of caliper on the measured diameters
Statistical Methods - STAT 581 - Problem Set 1
An experiment is designed to investigate whether the time to drill holes in rock differs between wet and dry drilling.
Statistical Methods - STAT 581 - Problem Set 2
A product developer is investigating the tensile strength of a new synthetic fiber that will be used to make cloth for men’s shirts.
Statistical Methods - STAT 581 - Problem Set 3 a
An experiment is conducted to study the effect of drilling method on drilling time. Each method (dry drilling, wet drilling) is used on $n = 12$ rocks.
Statistical Methods - STAT 581 - Problem Set 3 b
An experiment to compare a new drug to a standard is in the planning stages. The response variable of interest is the clotting time (in minutes) of blood
Statistical Methods - STAT 581 - Problem Set 4
The insulating life of protective fluids at an accelerated load is being studied. The experiment has been performed for four types of fluids, with $n = 5$
Statistical Methods - STAT 581 - Problem Set 5
A factorial experiment is used to develop a nitride etch process on a single wafer plasma etching tool.
Statistical Methods - STAT 581 - Problem Set 6
A soft drink bottler is interested in studying the effects on a filling process. A factorial experiment is run using three factors: percent carbonation (in %),
Statistical Methods - STAT 581 - Problem Set 7
A paired comparisons design is used to study the effect of machine operator on the measured running time (in secs.) of a fuse. A sample of $n = 10$ fuses is
Statistical Methods - STAT 581 - Problem Set 8
An experiment is designed to test for systematic differences in the hardness measurements provided by two devices (fixed effect, factor $A$).
Statistical Methods - STAT 581 - Problem Set 9
The surface finish of metal parts made on $a=4$ machines is under investigation. Each machine can be run by one of $b=3$ operators.
Computational Statistics - SIUe - STAT 575 - Problem Set 2
This problem set covers the E-M algorithm for right-censored normal data with known variance.
Review: A Symbolic Representation of Time Series, with Implications for Streaming Algorithms
In [1], the authors present a method for constructing a symbolic (nominal) representation for real-valued time series data. A symbolic representation is desirable because then it becomes possible to use many of the effective algorithms that require …
SIUe - Computational Statistics (STAT 575) - Problem Set 4
This problem set covers sampling from a Gamma distribution using Metropolis-Hastings and acceptance-rejection methods.
Bootstrap Methods: When Theory Meets Computational Statistics
Bootstrap methods sit at a beautiful intersection: rigorous statistical theory implemented through brute-force computation.
The Core Idea
The bootstrap is conceptually simple: if you don’t know the sampling distribution of a statistic, …
dfr.dist: Specify the Hazard Function Directly
Most survival analysis forces you to pick from a catalog—Weibull, exponential, log-normal. dfr.dist flips this: you specify the hazard function directly, and it handles all the math.
The Core Insight
Instead of choosing Weibull(shape, scale), you …
Regression Analysis - SIUe - STAT 482 - Problem Set 8
This problem set covers multicollinearity in regression analysis and the marginal and partial effects of predictor variables, among other topics.
STAT 482 - Regression Analysis - SIUe - Fall 2022
This is a problem set for STAT 482 - Regression Analysis at SIUe. These problem sets were given by Dr. Andrew Neath, a professor in the Department of Mathematics and Statistics at Southern Illinois University Edwardsville (SIUe) during the Fall 2022 …
STAT 575 - Computational Statistics - SIUe - Summer 2021
This is a problem set for STAT 575 - Computational Statistics at SIUe. These problem sets were given by Dr. Qiang Beidi, a professor in the Department of Mathematics and Statistics at Southern Illinois University Edwardsville (SIUe) during the Summer …
algebraic.mle: The Foundation of a Statistical Inference Ecosystem
An R package treating MLEs as first-class algebraic objects with composable statistical properties.
Discrete Multivariate Analysis - STAT 579 - Exam 2
Discrete multivariate analysis exam covering log-linear models and categorical data analysis.
Discrete Multivariate Analysis - STAT 579 - Final Exam
Final exam for discrete multivariate analysis.
Discrete Multivariate Analysis - STAT 579 - Problem Set 10
Problem set 10 for discrete multivariate analysis.
Discrete Multivariate Analysis - STAT 579 - Problem Set 5
Problem set 5 for discrete multivariate analysis.
Discrete Multivariate Analysis - STAT 579 - Problem Set 6
Problem set 6 for discrete multivariate analysis.
Discrete Multivariate Analysis - STAT 579 - Problem Set 7
Problem set 7 for discrete multivariate analysis.
Discrete Multivariate Analysis - STAT 579 - Problem Set 8
Problem set 8 for discrete multivariate analysis.
Discrete Multivariate Analysis - STAT 579 - Problem Set 9
Problem set 9 for discrete multivariate analysis.
STAT 478 - Time Series Analysis - SIUe - Spring 2021
Problem sets for STAT 478 - Time Series Analysis at SIUe, taught by Dr. Beidi during Spring 2021.
STAT 579 - Discrete Multivariate Analysis - SIUe - Spring 2021
Problem sets for STAT 579 - Discrete Multivariate Analysis at SIUe, taught by Dr. Andrew Neath during Spring 2021.
Time Series Analysis - STAT 478 - Exam 1
Time series analysis exam covering ARMA processes and model identification.
Time Series Analysis - STAT 478 - Exam 2
Time series analysis coursework.
Time Series Analysis - STAT 478 - Final Exam
Final exam for time series analysis course.
Time Series Analysis - STAT 478 - Problem Set 3
Problem set 3 for time series analysis.
Time Series Analysis - STAT 478 - Problem Set 4
Problem set 4 for time series analysis.
Time Series Analysis - STAT 478 - Problem Set 5
Problem set 5 for time series analysis.
Time Series Analysis - STAT 478 - Problem Set 6
Problem set 6 for time series analysis.
Time Series Analysis - STAT 478 - Project
Time series analysis project.
algebraic.dist: Treating Distributions as First-Class Algebraic Objects in R
Most statistical software treats probability distributions as static parameter sets you pass to sampling or density functions. algebraic.dist takes a different approach: distributions are algebraic objects that compose, transform, and combine using …
hypothesize: API for Hypothesis Testing
Likelihood Model
Algebra over distribution (random elements) objects
Algebraic Maximum Likelihood Estimators
Statistical Computing with R: Building Tools for Inference
One of the best parts of my mathematics degree is deepening my R skills—not just using R packages, but building them.
Why R for Statistical Computing
R has a unique position in statistics:
- Domain-specific: Built for statistics, not adapted to it …
Why I'm Pursuing a Second Master's in Mathematics
I’ve decided to pursue a second master’s degree—this time in Mathematics and Statistics at SIUE.
People ask: “You already have an MS in Computer Science. Why go back?”
The Honest Answer
Computer science gave me tools. …
Dynamic failure rate (DFR) distributions
Reliability Analysis and the Problem of Censored Data
One of the most interesting statistical problems I’ve encountered is reliability analysis with censored data—situations where you know something didn’t fail, but not when it will fail.
The Censoring Problem
Imagine testing light bulbs. …