Finds the optimum using the BFGS quasi-Newton algorithm, which approximates the inverse Hessian from gradient information alone. It is more efficient than Newton-Raphson when computing the exact Hessian is expensive.

Usage

bfgs(
  objective_fn,
  params,
  max_iter = 1000,
  tol = 1e-06,
  maximize = FALSE,
  line_search_fn = NULL,
  verbose = 0
)

Arguments

objective_fn

Function that takes a list of value objects and returns a scalar (the objective to optimize)

params

List of value objects (initial parameter values)

max_iter

Maximum number of iterations (default 1000)

tol

Convergence tolerance on the gradient norm (default 1e-6)

maximize

If TRUE, maximize; if FALSE (default), minimize

line_search_fn

Line search function (default: backtracking; see the sketch after this argument list)

verbose

Print progress every N iterations; 0 (the default) suppresses output
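
The signature expected of line_search_fn is not documented on this page, so the following is only an assumption: a backtracking (Armijo) rule that receives the objective f, the current point x, the search direction d, and the gradient g as plain numerics, and returns a step size. The name backtracking_ls and its interface are hypothetical.

backtracking_ls <- function(f, x, d, g, alpha = 1, beta = 0.5, c1 = 1e-4) {
  # Hypothetical interface: f, x, d, g are plain numerics; returns a step size.
  # Shrink alpha until the Armijo sufficient-decrease condition holds.
  fx <- f(x)
  while (f(x + alpha * d) > fx + c1 * alpha * sum(g * d)) {
    alpha <- beta * alpha
  }
  alpha
}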

Value

A list containing:

params

List of value objects at optimum

value

Objective function value at optimum

gradient

Gradient at optimum

inv_hessian

Approximate inverse Hessian

iterations

Number of iterations performed

converged

TRUE if gradient norm < tol
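
For example, a caller might check convergence before using the result (illustrative; objective_fn and params stand for any valid inputs from Usage above):

result <- bfgs(objective_fn, params)
if (!result$converged) {
  warning("stopped after ", result$iterations,
          " iterations with gradient norm still above tol")
}
result$value  # objective at the returned parameters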

Details

BFGS maintains an approximation B to the inverse Hessian, updated at each iteration as:

  B_{k+1} = (I - ρ s y') B_k (I - ρ y s') + ρ s s'

where s = x_{k+1} - x_k, y = g_{k+1} - g_k, and ρ = 1 / (y' s).

The search direction is d = -B g, followed by a line search along d to determine the step size.
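
Written out with ordinary numeric vectors and matrices, one update step looks like the sketch below. This is illustrative only, not the package's internal value-object code, and bfgs_update is a hypothetical helper name.

bfgs_update <- function(B, s, y) {
  # s = x_{k+1} - x_k,  y = g_{k+1} - g_k,  B is the current n x n approximation
  rho <- 1 / sum(y * s)
  I <- diag(length(s))
  (I - rho * s %*% t(y)) %*% B %*% (I - rho * y %*% t(s)) + rho * s %*% t(s)
}

# Search direction from the current gradient g:
# d <- -B %*% g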

Examples

if (FALSE) { # \dontrun{
# Minimize Rosenbrock function
rosenbrock <- function(p) {
  x <- p[[1]]; y <- p[[2]]
  (1 - x)^2 + 100 * (y - x^2)^2
}
result <- bfgs(rosenbrock, list(val(-1), val(-1)))
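# data() extracts the numeric value from each optimized parameter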
sapply(result$params, data)  # Should be close to c(1, 1)
} # }
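
As a further illustration, maximize = TRUE flips the sign of the problem. This sketch assumes val() and the data() accessor behave as in the Rosenbrock example above:

# Maximize a concave quadratic with its peak at x = 2
concave <- function(p) -(p[[1]] - 2)^2
result <- bfgs(concave, list(val(0)), maximize = TRUE)
data(result$params[[1]])  # should be close to 2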