symlik: Symbolic Likelihood Models in Python

symlik is a Python library for symbolic likelihood models. Write your log-likelihood as a symbolic expression, and it derives everything needed for inference.

The Problem

Traditional statistical computing gives you two choices:

  1. Manual derivation. Work out score functions and information matrices by hand, then implement them. Error-prone, tedious.
  2. Numerical approximation. Use finite differences. Unstable, slow, no symbolic form to inspect.
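
To make the second option concrete, here is a small sketch (plain Python, not symlik code) comparing a central finite difference against the hand-derived score of the exponential log-likelihood, l(lambda) = n*log(lambda) - lambda*sum(x). The helper names and the step sizes are chosen for illustration only.

```python
import math

x = [1.2, 0.8, 2.1, 1.5]
n, total = len(x), sum(x)

def log_lik(lam):
    # Exponential log-likelihood: n*log(lambda) - lambda*sum(x)
    return n * math.log(lam) - lam * total

def exact_score(lam):
    # Hand-derived derivative of log_lik with respect to lambda
    return n / lam - total

def central_difference(f, lam, h):
    # Numerical derivative; its accuracy depends on the step size h
    return (f(lam + h) - f(lam - h)) / (2 * h)

lam = 1.0
errors = {
    h: abs(central_difference(log_lik, lam, h) - exact_score(lam))
    for h in (1e-1, 1e-5, 1e-13)
}
for h, err in errors.items():
    print(f"h={h:g}  |error|={err:.2e}")
```

A large step suffers from truncation error, and a very small step typically suffers from floating-point cancellation, so the step size has to be tuned. An exact derivative has no such knob.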

The Approach

symlik takes a third path: symbolic differentiation. Define the model once, get exact derivatives automatically.

from symlik.distributions import exponential

model = exponential()
data = {'x': [1.2, 0.8, 2.1, 1.5]}

mle, _ = model.mle(data=data, init={'lambda': 1.0})
se = model.se(mle, data)

print(f"Rate: {mle['lambda']:.3f} +/- {se['lambda']:.3f}")
# Rate: 0.714 +/- 0.357

Behind the scenes, symlik:

  1. Symbolically differentiates the log-likelihood to get the score function
  2. Differentiates again for the Hessian
  3. Computes Fisher information from the negative Hessian
  4. Derives standard errors from the inverse information matrix

Every derivative is an exact symbolic expression. None is approximated by finite differences.
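
For the exponential model the four steps can be checked by hand. The sketch below is plain Python with the closed-form derivatives written out manually (it does not call symlik), and it reproduces the numbers printed above.

```python
import math

x = [1.2, 0.8, 2.1, 1.5]
n, total = len(x), sum(x)

def score(lam):
    # Step 1: d/d(lambda) of n*log(lambda) - lambda*sum(x)
    return n / lam - total

def hessian(lam):
    # Step 2: second derivative of the log-likelihood
    return -n / lam**2

# Setting the score to zero gives the MLE in closed form
lam_hat = n / total

# Step 3: information is the negative Hessian at the MLE
info = -hessian(lam_hat)

# Step 4: standard error from the inverse information
se = math.sqrt(1.0 / info)

print(f"Rate: {lam_hat:.3f} +/- {se:.3f}")
# Rate: 0.714 +/- 0.357
```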

Custom Models

The real power is defining custom models using s-expressions, written as nested Python lists in prefix notation:

from symlik import LikelihoodModel

# Exponential: l(lambda) = sum[log(lambda) - lambda*x_i]
log_lik = ['sum', 'i', ['len', 'x'],
           ['+', ['log', 'lambda'],
            ['*', -1, ['*', 'lambda', ['@', 'x', 'i']]]]]

model = LikelihoodModel(log_lik, params=['lambda'])

# Symbolic derivatives available
score = model.score()       # Gradient
hess = model.hessian()      # Hessian matrix
info = model.information()  # Fisher information

You define the log-likelihood once as a symbolic expression. symlik computes the rest.
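
To show how the nested-list form reads, here is a toy evaluator for the operators used in the expression above. It is an illustration only, not symlik's or rerum's implementation: the `evaluate` function is hypothetical, and it assumes that `['sum', 'i', n, body]` runs `i` from 1 to `n` and that `'@'` is 1-based indexing.

```python
import math

def evaluate(expr, env):
    """Evaluate a nested-list expression against a dict of values."""
    if isinstance(expr, (int, float)):
        return expr                      # numeric literal
    if isinstance(expr, str):
        return env[expr]                 # variable lookup
    op, *args = expr
    if op == '+':
        return sum(evaluate(a, env) for a in args)
    if op == '*':
        return math.prod(evaluate(a, env) for a in args)
    if op == 'log':
        return math.log(evaluate(args[0], env))
    if op == 'len':
        return len(evaluate(args[0], env))
    if op == '@':
        # Assumed 1-based indexing into a sequence
        seq, idx = evaluate(args[0], env), evaluate(args[1], env)
        return seq[idx - 1]
    if op == 'sum':
        # Assumed form: ['sum', index_name, upper_bound, body]
        var, upper, body = args
        n = evaluate(upper, env)
        return sum(evaluate(body, {**env, var: k}) for k in range(1, n + 1))
    raise ValueError(f"unknown operator: {op!r}")

log_lik = ['sum', 'i', ['len', 'x'],
           ['+', ['log', 'lambda'],
            ['*', -1, ['*', 'lambda', ['@', 'x', 'i']]]]]

value = evaluate(log_lik, {'lambda': 0.5, 'x': [1.2, 0.8, 2.1, 1.5]})
print(value)
```

The result equals 4*log(0.5) - 0.5*5.6, the exponential log-likelihood evaluated at lambda = 0.5.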

Heterogeneous Data

One of symlik’s strengths is handling mixed observation types, which is exactly what you need for reliability analysis with censored data:

from symlik import ContributionModel
from symlik.contributions import complete_exponential, right_censored_exponential

model = ContributionModel(
    params=["lambda"],
    type_column="status",
    contributions={
        "observed": complete_exponential(),
        "censored": right_censored_exponential(),
    }
)

data = {
    "status": ["observed", "censored", "observed", "observed", "censored"],
    "t": [1.2, 3.0, 0.8, 2.1, 4.5],
}

Each observation type contributes differently to the likelihood. symlik handles the bookkeeping.
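
As a hand-worked sketch of what that bookkeeping amounts to, assume the standard textbook contributions: an observed failure at time t adds log(lambda) - lambda*t, and a right-censored time adds only the log-survival term -lambda*t. Whether symlik's `complete_exponential` and `right_censored_exponential` encode exactly these forms is an assumption here, and the code below is plain Python, not symlik.

```python
import math

status = ["observed", "censored", "observed", "observed", "censored"]
t = [1.2, 3.0, 0.8, 2.1, 4.5]

def log_lik(lam):
    total = 0.0
    for s, ti in zip(status, t):
        if s == "observed":
            total += math.log(lam) - lam * ti   # log density
        else:
            total += -lam * ti                  # log survival
    return total

# Closed-form MLE: number of observed failures over total time at risk
d = sum(s == "observed" for s in status)
exposure = sum(t)
lam_hat = d / exposure

# Second derivative is -d/lambda^2, so the standard error is lambda_hat/sqrt(d)
se = lam_hat / math.sqrt(d)

print(f"Rate: {lam_hat:.3f} +/- {se:.3f}")
```

Only the three observed failures contribute a log(lambda) term, but all five times count toward the exposure. Dropping the censored rows would overestimate the rate.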

Connection to Research

symlik is the Python successor to my R package likelihood.model. It implements the theoretical framework from my thesis work on likelihood-based inference for series systems.

The Weibull Series Model Selection paper applies these ideas to reliability engineering, where the likelihoods are complex enough to benefit from symbolic treatment.

Powered by rerum

symlik uses rerum for symbolic differentiation. rerum is a pattern matching and term rewriting library that handles the calculus. The separation means you can use rerum for other symbolic computation tasks beyond likelihood models.
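
To give a feel for differentiation by rewriting, here is a toy differentiator over the same nested-list notation. It is not rerum's API: `diff` and `simplify` are hypothetical names, and the rules are hard-coded as Python branches rather than expressed as rewrite patterns. It covers only sums, binary products, and logs.

```python
def simplify(expr):
    """Apply a few identity rules: drop +0 and *1, collapse *0."""
    op, *args = expr
    if op == '+':
        args = [a for a in args if a != 0]
        if not args:
            return 0
        return args[0] if len(args) == 1 else ['+', *args]
    if op == '*':
        if any(a == 0 for a in args):
            return 0
        args = [a for a in args if a != 1]
        if not args:
            return 1
        return args[0] if len(args) == 1 else ['*', *args]
    return expr

def diff(expr, var):
    """Differentiate a nested-list expression with respect to var."""
    if isinstance(expr, (int, float)):
        return 0                                  # d(c)/dv = 0
    if isinstance(expr, str):
        return 1 if expr == var else 0            # d(v)/dv = 1
    op, *args = expr
    if op == '+':                                 # sum rule
        return simplify(['+', *[diff(a, var) for a in args]])
    if op == '*':                                 # product rule (binary)
        u, v = args
        return simplify(['+',
                         simplify(['*', diff(u, var), v]),
                         simplify(['*', u, diff(v, var)])])
    if op == 'log':                               # chain rule for log
        (u,) = args
        return simplify(['*', ['/', 1, u], diff(u, var)])
    raise ValueError(f"no rule for operator: {op!r}")

# One observation's contribution: log(lambda) - lambda*x
term = ['+', ['log', 'lambda'], ['*', -1, ['*', 'lambda', 'x']]]
print(diff(term, 'lambda'))
# ['+', ['/', 1, 'lambda'], ['*', -1, 'x']]
```

The output reads as 1/lambda - x, the per-observation score of the exponential model.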

Installation

Available on PyPI:

pip install symlik

Documentation at queelius.github.io/symlik.

See the project page for more details.
