Introduction
femtograd is an R package for automatic differentiation
(AD) designed for classical statistics. Unlike packages focused on deep
learning, femtograd emphasizes:
- Pedagogy: Clear, readable code that demonstrates how AD works
- Statistics: Built-in support for MLE, Hessians, and statistical inference
-
R integration: Standard R generics
(
coef,vcov,confint, etc.)
This vignette introduces the core concepts and workflow.
Creating Differentiable Values
The foundation of femtograd is the value object, which
represents a node in a computational graph. Create values using
val():
Values can be scalars, vectors, or matrices:
Building Computations
Standard mathematical operations work with value objects and automatically build a computational graph:
x <- val(3)
y <- val(4)
# Build computation
z <- x^2 + x * y
# Access the result
get_data(z)
#> [1] 21Femtograd supports many operations:
-
Arithmetic:
+,-,*,/,^ -
Transcendental:
log,exp,sqrt,sin,cos,tanh -
Special functions:
lgamma,digamma,log1p -
Activations:
sigmoid,relu,softplus,logit
Computing Gradients
Call backward() on any value to compute gradients via
reverse-mode AD (backpropagation). Then use grad() to
access the gradient:
x <- val(3)
y <- val(4)
z <- x^2 + x * y # z = 9 + 12 = 21
# Compute gradients
backward(z)
# Access gradients
grad(x) # dz/dx = 2x + y = 10
#> [1] 10
grad(y) # dz/dy = x = 3
#> [1] 3The gradient represents how much z changes when we
change each input.
Maximum Likelihood Estimation
The primary use case for femtograd is maximum likelihood estimation.
The fit() function combines optimization and inference:
# Generate data from a normal distribution
set.seed(42)
data <- rnorm(100, mean = 5, sd = 2)
# Define log-likelihood with named parameters
# Use log(sigma) to ensure sigma > 0
result <- fit(
function(mu, log_sigma) {
sigma <- exp(log_sigma)
loglik_normal(mu, sigma, data)
},
params = c(mu = 0, log_sigma = 0)
)
result
#> femtofit: Maximum Likelihood Estimation
#>
#> Coefficients:
#> mu log_sigma
#> 5.0650296 0.7286466
#>
#> Log-likelihood: -214.8Using Base R Generics
The fitted model supports standard R generics:
# Parameter estimates
coef(result)
#> mu log_sigma
#> 5.0650296 0.7286466
# Standard errors
se(result)
#> mu log_sigma
#> 0.20722742 0.07071068
# Variance-covariance matrix
vcov(result)
#> mu log_sigma
#> mu 4.294320e-02 -2.117559e-11
#> log_sigma -2.117559e-11 5.000000e-03
# Confidence intervals
confint(result)
#> 2.5% 97.5%
#> mu 4.6588714 5.471188
#> log_sigma 0.5900563 0.867237
# Log-likelihood (enables AIC/BIC)
logLik(result)
#> 'log Lik.' -214.7585 (df=2)
AIC(result)
#> [1] 433.517For a detailed summary with p-values:
summary(result)
#> femtofit: Maximum Likelihood Estimation
#>
#> Call:
#> fit(loglik = function(mu, log_sigma) {
#> sigma <- exp(log_sigma)
#> loglik_normal(mu, sigma, data)
#> }, params = c(mu = 0, log_sigma = 0))
#>
#> Coefficients:
#> Estimate Std. Error z value Pr(>|z|)
#> mu 5.065030 0.207227 24.442 < 2.2e-16 ***
#> log_sigma 0.728647 0.070711 10.305 < 2.2e-16 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> ---
#> Log-likelihood: -214.8 AIC: 433.5
#> Converged in 22 iterationsBuilt-in Distributions
Femtograd includes log-likelihood functions for common distributions:
| Distribution | Function | Parameters |
|---|---|---|
| Normal | loglik_normal(mu, sigma, x) |
mean, sd |
| Exponential | loglik_exponential(rate, x) |
rate |
| Poisson | loglik_poisson(lambda, x) |
mean |
| Binomial | loglik_binomial(p, n, x) |
probability, trials |
| Gamma | loglik_gamma(shape, rate, x) |
shape, rate |
| Beta | loglik_beta(alpha, beta, x) |
shape1, shape2 |
| Weibull | loglik_weibull(shape, scale, x) |
shape, scale |
| Pareto | loglik_pareto(alpha, x_min, x) |
shape, scale |
Example: Exponential Distribution
# Generate exponential data
set.seed(123)
waiting_times <- rexp(50, rate = 0.5)
# Fit exponential model
exp_fit <- fit(
function(rate) loglik_exponential(rate, waiting_times),
params = c(rate = 1)
)
#> Warning in log(x$data): NaNs produced
#> Warning in log(x$data): NaNs produced
#> Warning in log(x$data): NaNs produced
#> Warning in log(x$data): NaNs produced
#> Warning in log(x$data): NaNs produced
#> Warning in log(x$data): NaNs produced
# True rate was 0.5, MLE should be close to 1/mean
coef(exp_fit)
#> rate
#> 0.4423327
1 / mean(waiting_times)
#> [1] 0.4423327Parameter Constraints
For parameters with constraints (e.g., variance > 0, probability in [0,1]), use transformations:
# sigma > 0: use log(sigma), then exp() in likelihood
# 0 < p < 1: use logit(p), then sigmoid() in likelihood
# Example: Beta distribution with shape constraints (both > 0)
set.seed(456)
beta_data <- rbeta(50, shape1 = 2, shape2 = 5)
beta_fit <- fit(
function(log_alpha, log_beta) {
alpha <- exp(log_alpha) # Ensures alpha > 0
beta <- exp(log_beta) # Ensures beta > 0
loglik_beta(alpha, beta, beta_data)
},
params = c(log_alpha = 0, log_beta = 0)
)
# Transform back to original scale
exp(coef(beta_fit))
#> log_alpha log_beta
#> 2.467469 6.149787Femtograd also provides helper functions: positive(),
probability(), and bounded().
Model Comparison
Compare multiple models using AIC:
# Fit two models: fixed vs estimated sigma
m1 <- fit(function(mu) loglik_normal(mu, 2, data), c(mu = 0))
m2 <- fit(
function(mu, log_sigma) loglik_normal(mu, exp(log_sigma), data),
c(mu = 0, log_sigma = 0)
)
compare(m1, m2, names = c("Fixed sigma", "Free sigma"))
#> Model Comparison (ranked by AIC)
#>
#> Model df logLik AIC BIC dAIC weight
#> Fixed sigma 1 -214.89 431.78 NA 0.00 0.705
#> Free sigma 2 -214.76 433.52 NA 1.74 0.295For nested models, use anova() for likelihood ratio
tests:
anova(m1, m2)
#> Analysis of Deviance Table (Likelihood Ratio Tests)
#>
#> Df logLik Df diff Chisq Pr(>Chisq)
#> m1 1.0 -214.9
#> m2 2.0 -214.8 1 0.258 0.611Next Steps
- See
vignette("computational-graphs")to understand how AD works - See
vignette("inference")for advanced inference (bootstrap, profile likelihood) - See
vignette("survival-analysis")for a complete worked example - See
vignette("architecture")for how AD engines are built and what makes femtograd different from production tools