Your First Model

This guide walks through fitting an exponential distribution to data. Along the way, you will see how symlik handles the statistical computations.

The Setup

Suppose you have measurements of how long customers wait in a queue:

wait_times = [2.1, 0.8, 1.5, 3.2, 1.1, 0.9, 2.4, 1.8]

The exponential distribution is a natural model for waiting times. Its probability density is:

\[ f(x; \lambda) = \lambda e^{-\lambda x} \]

where \(\lambda\) is the rate parameter; a higher rate means shorter average waits.
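Maximum likelihood works with the log of the joint density. For \(n\) independent waits \(x_1, \dots, x_n\), the log-likelihood is:

\[ \ell(\lambda) = \sum_{i=1}^{n} \log f(x_i; \lambda) = n \log \lambda - \lambda \sum_{i=1}^{n} x_i \]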

Fitting the Model

from symlik.distributions import exponential

# Create the model
model = exponential()

# Fit to data
data = {'x': [2.1, 0.8, 1.5, 3.2, 1.1, 0.9, 2.4, 1.8]}
fit = model.fit(data=data, init={'lambda': 1.0})

print(f"Estimated rate: {fit.params['lambda']:.4f}")
print(f"Converged: {fit.converged}")

The fit() method uses Newton-Raphson optimization with symbolically computed derivatives and returns a FittedLikelihoodModel containing the results (estimates, standard errors, and fit statistics).

Understanding the Output

The MLE for an exponential distribution is \(\hat\lambda = 1/\bar{x}\):

import numpy as np
sample_mean = np.mean(data['x'])
print(f"Sample mean: {sample_mean:.4f}")
print(f"1/mean: {1/sample_mean:.4f}")
print(f"MLE: {fit.params['lambda']:.4f}")
# 1/mean and the MLE should match
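They agree because the score equation has a closed-form solution: setting the derivative of the log-likelihood to zero gives

\[ \frac{\partial \ell}{\partial \lambda} = \frac{n}{\lambda} - \sum_{i=1}^{n} x_i = 0 \quad\Longrightarrow\quad \hat\lambda = \frac{n}{\sum_{i} x_i} = \frac{1}{\bar{x}} \]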

Standard Errors and Confidence Intervals

Standard errors and confidence intervals are available directly from the fitted model:

# Standard error
print(f"SE(lambda): {fit.se['lambda']:.4f}")

# 95% confidence interval
ci = fit.conf_int()
print(f"95% CI: ({ci['lambda'][0]:.3f}, {ci['lambda'][1]:.3f})")

The standard error comes from the observed Fisher information:

\[ \text{SE}(\hat\lambda) = \sqrt{I(\hat\lambda)^{-1}} \]

For the exponential distribution, this works out to \(\hat\lambda / \sqrt{n}\).
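As a quick sanity check, you can reproduce both numbers with plain numpy. The interval below assumes conf_int() returns a standard Wald interval (estimate ± 1.96 standard errors), which is the usual large-sample construction:

import numpy as np

n = len(data['x'])
lam_hat = fit.params['lambda']

# Observed-information standard error for the exponential: lambda_hat / sqrt(n)
se_by_hand = lam_hat / np.sqrt(n)
print(f"SE by hand:  {se_by_hand:.4f}")
print(f"SE from fit: {fit.se['lambda']:.4f}")

# Wald-type 95% interval (assumes conf_int() uses the same construction)
print(f"95% CI by hand: ({lam_hat - 1.96 * se_by_hand:.3f}, {lam_hat + 1.96 * se_by_hand:.3f})")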

Model Fit Statistics

The fitted model provides information criteria and other fit statistics:

print(f"Log-likelihood: {fit.llf:.4f}")
print(f"AIC: {fit.aic:.4f}")
print(f"BIC: {fit.bic:.4f}")
print(f"Observations: {fit.nobs}")

Summary Table

Get a formatted summary of all results:

print(fit.summary())

What Happened Behind the Scenes

When you called model.fit(), symlik:

  1. Differentiated the log-likelihood symbolically to get the score function
  2. Differentiated again to get the Hessian matrix
  3. Iterated Newton-Raphson: \(\theta_{\text{new}} = \theta - H^{-1} \nabla \ell\)
  4. Stopped when the score was close to zero

You can inspect these symbolic quantities:

# The score function (should be zero at MLE)
score = model.score()
print(f"Score expression: {score}")

# Evaluate at MLE
score_at_mle = model.score_at({**data, **fit.params})
print(f"Score at MLE: {score_at_mle}")  # Should be ~0

Adding Bounds

Some parameters have natural constraints. For rate parameters, we need \(\lambda > 0\):

fit = model.fit(
    data=data,
    init={'lambda': 1.0},
    bounds={'lambda': (0.01, 10.0)}
)

Bounds keep the optimizer in valid parameter space.

Next Steps