active library

fisher-flow

Fisher Flow: A unified information-geometric framework for sequential inference revealing how modern optimizers (Adam, Natural Gradient, K-FAC, EWC) emerge as special cases of Fisher information propagation

Started 2025 HTML

Resources & Distribution

Source Code

Package Registries

Publications

Fisher Flow: An Information-Geometric Framework for Sequential Estimation

(2024)

Fisher Flow: Information-Geometric Sequential Inference

Paper License: MIT Python 3.8+

Fisher Flow (FF) is a unified framework for sequential parameter estimation that propagates Fisher information rather than probability distributions. It provides uncertainty quantification with 10-100x speedup over Bayesian methods.

🎯 The Core Insight

Instead of tracking all possible parameter values and their probabilities (expensive!), Fisher Flow tracks just two things:

  1. Your best parameter estimate
  2. The Fisher Information Matrix (how confident you are)

When new data arrives, both update with simple matrix arithmetic—no integration required!

🚀 Key Features

  • Unified Framework: Reveals that Adam, Natural Gradient, and Elastic Weight Consolidation are all special cases
  • Fast Uncertainty: Get confidence intervals without MCMC or variational inference
  • Streaming Ready: Process data sequentially with bounded memory
  • Distributed: Information matrices add across workers
  • Theoretically Grounded: Proven convergence and efficiency guarantees

📊 Quick Example

import numpy as np
from fisher_flow import DiagonalFF, KroneckerFF, FullFF

# Online logistic regression with uncertainty
model = DiagonalFF(dim=784)

for batch in data_stream:
    # Update with new data
    estimate, uncertainty = model.update(batch)
    
    # Get confidence intervals
    ci_lower, ci_upper = model.confidence_interval(0.95)
    
    # Make predictions with uncertainty
    pred_mean, pred_std = model.predict(x_new)

🧠 Why Fisher Flow?

The Problem

Modern ML needs methods that can:

  • Process streaming data efficiently
  • Quantify uncertainty in predictions
  • Scale to billions of parameters
  • Combine information from distributed sources

Bayesian inference handles uncertainty but doesn’t scale. SGD scales but lacks uncertainty.

The Solution

Fisher Flow bridges this gap by propagating Fisher information—a quadratic approximation to the log-posterior curvature.

What’s Actually New?

We didn’t invent new math—we recognized a pattern. Many successful methods are implicitly doing Fisher Flow:

MethodWhat It Actually Is
AdamDiagonal Fisher Flow
Natural GradientFull Fisher Flow
K-FACKronecker Fisher Flow
Elastic Weight ConsolidationFisher Flow with memory
Kalman FilterLinear-Gaussian Fisher Flow

By naming this pattern, we can:

  • Design new algorithms systematically
  • Understand why existing methods work
  • Choose approximations principled

📦 Installation

# From PyPI (coming soon)
pip install fisher-flow

# From source
git clone https://github.com/yourusername/fisher-flow.git
cd fisher-flow
pip install -e .

🎓 The Fisher Flow Family

Choose your approximation based on your needs:

By Structure

  • ScalarFF: One learning rate for all (SGD-like)
  • DiagonalFF: Per-parameter learning rates (Adam-like)
  • BlockFF: Groups share information (layer-wise)
  • KroneckerFF: For matrix parameters (K-FAC-like)
  • FullFF: Complete information matrix (Natural Gradient)

By Memory

  • StationaryFF: Accumulate forever
  • WindowedFF: Recent data only
  • ExponentialFF: Gradual forgetting
  • AdaptiveFF: Detect and adapt to changes

📈 Performance

Benchmark results on standard tasks:

MethodAccuracyCalibration (ECE)Time (s)Memory
SGD75.4%0.0821.2O(d)
Adam76.1%0.0711.8O(d)
Fisher Flow (Diagonal)76.3%0.0482.1O(d)
Fisher Flow (Block)76.8%0.0414.5O(d)
Variational Bayes76.5%0.04545.3O(d²)

🔬 Mathematical Foundation

Fisher Flow updates follow the natural gradient on statistical manifolds:

# Information accumulation
I_t = I_{t-1} + F(batch_t)

# Parameter update  
θ_t = I_t^{-1} (I_{t-1} θ_{t-1} + F(batch_t) θ_batch)

Where F(batch) is the Fisher Information from the batch. This simple update rule:

  • ✅ Is invariant to reparameterization
  • ✅ Achieves Cramér-Rao efficiency bound
  • ✅ Combines information optimally
  • ✅ Scales to streaming settings

📚 Learn More

Accessible Introduction

Technical Deep Dive

Code Examples

🤝 Contributing

We welcome contributions! Fisher Flow is a general pattern with many unexplored variants.

Ideas to Explore

  • Sparse Fisher Flow for high-dimensional models
  • Fisher Flow for graph neural networks
  • Hardware-optimized implementations
  • Fisher Flow for reinforcement learning
  • Non-parametric extensions

See CONTRIBUTING.md for guidelines.

📖 Citation

If you use Fisher Flow in your research, please cite:

@article{towell2025fisherflow,
  title={Fisher Flow: Information-Geometric Sequential Inference},
  author={Towell, Alex},
  journal={arXiv preprint arXiv:2025.xxxxx},
  year={2025}
}

🗺️ Roadmap

Phase 1: Core Library (Current)

  • Basic Fisher Flow implementations
  • Standard benchmarks
  • PyTorch/JAX/TensorFlow backends
  • Documentation and tutorials

Phase 2: Applications

  • Integration with popular ML libraries
  • Uncertainty quantification toolkit
  • Continual learning framework
  • Distributed training support

Phase 3: Extensions

  • Moment propagation beyond Fisher
  • Causal Fisher Flow
  • Fisher Flow for scientific computing
  • AutoML for choosing approximations

💡 The Big Picture

Fisher Flow isn’t just another optimization algorithm—it’s a new lens for understanding learning:

All learning is information propagation with different carriers, metrics, dynamics, and objectives.

This perspective unifies:

  • Supervised learning → Propagate label information to parameters
  • Unsupervised learning → Propagate structure information to representations
  • Meta-learning → Propagate task information to priors
  • Transfer learning → Propagate domain information across tasks

📬 Contact

📄 License

MIT License - see LICENSE file for details.


“Sometimes the biggest contribution isn’t inventing something new—it’s recognizing what’s already there and giving it a name.”

Related Resources

Explore related blog posts, projects, and publications

Publications

Fisher Flow: An Information-Geometric Framework for Sequential Estimation

(2024)

Discussion