Accumux is a framework for combining statistical accumulators using algebraic composition. The idea is simple: accumulators form a monoid under composition, so you can combine them with +, process data in a single pass, and extract all results.
The Problem
Computing multiple statistics over large datasets usually means multiple passes over the data, hand-rolled code combining different algorithms, or numerical instability from naive implementations. Accumux solves this with compositional accumulators.
Quick Example
#include <vector>
#include "accumux/accumulators/kbn_sum.hpp"
#include "accumux/accumulators/welford.hpp"
#include "accumux/core/composition.hpp"
using namespace accumux;
// Compose accumulators with +
auto stats = kbn_sum<double>() + welford_accumulator<double>();
// Single pass through data
std::vector<double> data = {1.0, 2.0, 3.0, 4.0, 5.0};
for (const auto& value : data) {
    stats += value;
}
// Extract all results
auto sum = stats.get_first().eval(); // 15.0
auto mean = stats.get_second().mean(); // 3.0
auto variance = stats.get_second().sample_variance(); // 2.5
Numerically Stable Algorithms
Accumux uses proven algorithms that maintain accuracy even with ill-conditioned data.
Kahan-Babushka-Neumaier Summation
Naive floating-point summation silently drops small terms when they are added to a much larger running total:
// Naive sum fails on this
std::vector<double> values = {1.0, 1e100, 1.0, -1e100};
// Naive: 0.0 (wrong!)
// KBN: 2.0 (correct!)
auto summer = kbn_sum<double>();
for (auto v : values) summer += v;
std::cout << summer.eval(); // 2.0
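To make the compensation step concrete, here is a minimal standalone sketch of Neumaier's variant of Kahan summation. It is not accumux's kbn_sum implementation; the names neumaier_sum, naive_total, and compensated_total are hypothetical and exist only for this illustration.

```cpp
#include <cmath>
#include <vector>

// Hypothetical stand-in for a compensated summer (not accumux's kbn_sum).
struct neumaier_sum {
    double sum = 0.0;           // running sum, subject to rounding
    double compensation = 0.0;  // low-order bits lost by the running sum

    neumaier_sum& operator+=(double x) {
        const double t = sum + x;
        if (std::abs(sum) >= std::abs(x)) {
            compensation += (sum - t) + x;  // low-order bits of x were lost
        } else {
            compensation += (x - t) + sum;  // low-order bits of sum were lost
        }
        sum = t;
        return *this;
    }

    // The corrected total is the running sum plus everything it dropped.
    double eval() const { return sum + compensation; }
};

// Plain left-to-right summation, for comparison.
double naive_total(const std::vector<double>& xs) {
    double s = 0.0;
    for (double x : xs) s += x;
    return s;
}

double compensated_total(const std::vector<double>& xs) {
    neumaier_sum s;
    for (double x : xs) s += x;
    return s.eval();
}
```

For the four values used above, naive_total returns 0.0 and compensated_total returns 2.0: each 1.0 that is absorbed by 1e100 is kept in the compensation term and added back at the end.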
Welford’s Online Algorithm
Computes mean and variance in a single pass without catastrophic cancellation:
auto welford = welford_accumulator<double>();
for (auto v : data) welford += v;
welford.count(); // Number of samples
welford.mean(); // Running mean
welford.sample_variance(); // Unbiased (n - 1) sample variance
welford.sample_std_dev(); // Sample standard deviation
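The update rule behind those accessors is short enough to sketch in full. The struct below is an illustrative, self-contained version of Welford's recurrence, not the library's welford_accumulator; welford_sketch and summarize are hypothetical names, and the member names simply mirror the calls shown above.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in (not accumux's welford_accumulator).
struct welford_sketch {
    std::size_t n = 0;   // number of observations
    double mean_ = 0.0;  // running mean
    double m2 = 0.0;     // sum of squared deviations from the running mean

    welford_sketch& operator+=(double x) {
        ++n;
        const double delta = x - mean_;  // deviation from the old mean
        mean_ += delta / static_cast<double>(n);
        m2 += delta * (x - mean_);       // old deviation times new deviation
        return *this;
    }

    std::size_t count() const { return n; }
    double mean() const { return mean_; }

    double sample_variance() const {
        assert(n > 1);  // the unbiased estimate needs at least two samples
        return m2 / static_cast<double>(n - 1);
    }

    double sample_std_dev() const { return std::sqrt(sample_variance()); }
};

// Single pass over the data.
welford_sketch summarize(const std::vector<double>& xs) {
    welford_sketch w;
    for (double x : xs) w += x;
    return w;
}
```

Because the update works on deviations from the current mean rather than on raw sums of squares, it avoids subtracting two large, nearly equal quantities. For the data {1, 2, 3, 4, 5} this sketch gives a mean of 3.0 and a sample variance of 2.5.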
Min/Max Tracking
auto minmax = minmax_accumulator<double>();
for (auto v : data) minmax += v;
minmax.min(); // Minimum value
minmax.max(); // Maximum value
Algebraic Composition
The key insight is that accumulators form a monoid under composition, so a composition is itself an accumulator and can be composed further.
// Compose arbitrarily many accumulators.
// operator+ is left-associative, so this nests as (kbn_sum + welford) + minmax.
auto financial = kbn_sum<double>() +
                 welford_accumulator<double>() +
                 minmax_accumulator<double>();
std::vector<double> returns = {0.05, -0.02, 0.03, 0.01, -0.01, 0.04};
for (auto ret : returns) {
    financial += ret; // All three update simultaneously
}
// Extract nested results (assuming each + builds a nested pair)
auto total = financial.get_first().get_first().eval();
auto mean = financial.get_first().get_second().mean();
auto volatility = financial.get_first().get_second().sample_std_dev();
auto worst = financial.get_second().min();
auto best = financial.get_second().max();
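One way such a composition could be built is with a small pair template whose += forwards each value to both components. The sketch below is an assumption about the general shape, not accumux's actual composition type; the sketch namespace, composed, sum_acc, count_acc, and max_acc are all hypothetical names.

```cpp
#include <algorithm>
#include <cstddef>
#include <limits>
#include <utility>

namespace sketch {  // hypothetical names throughout; not the accumux API

struct sum_acc {
    double total = 0.0;
    sum_acc& operator+=(double x) { total += x; return *this; }
    double eval() const { return total; }
};

struct count_acc {
    std::size_t n = 0;
    count_acc& operator+=(double) { ++n; return *this; }
    std::size_t count() const { return n; }
};

struct max_acc {
    double best = -std::numeric_limits<double>::infinity();
    max_acc& operator+=(double x) { best = std::max(best, x); return *this; }
    double max() const { return best; }
};

// A pair of accumulators that is itself an accumulator.
template <typename A, typename B>
class composed {
public:
    composed(A a, B b) : first_(std::move(a)), second_(std::move(b)) {}

    // Forward every observation to both components.
    composed& operator+=(double x) {
        first_ += x;
        second_ += x;
        return *this;
    }

    const A& get_first() const { return first_; }
    const B& get_second() const { return second_; }

private:
    A first_;
    B second_;
};

// a + b builds a pair; a + b + c therefore nests as composed<composed<A, B>, C>.
template <typename A, typename B>
composed<A, B> operator+(A a, B b) {
    return composed<A, B>(std::move(a), std::move(b));
}

}  // namespace sketch
```

In this sketch, a three-way composition nests to the left because C++ evaluates a + b + c as (a + b) + c, so the first two components are reached through get_first() and the last one through get_second().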
Mathematical Foundation
Monoid Structure
Each accumulator type A forms a monoid. The identity is the empty accumulator with no observations. The binary operation merges two accumulators by combining their observations, and it is associative, so partial results can be grouped in any way before merging.
auto a = welford_accumulator<double>();
auto b = welford_accumulator<double>();
// Process different data
for (auto v : data1) a += v;
for (auto v : data2) b += v;
// Merge results
auto combined = a + b; // Equivalent to processing data1 followed by data2
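For mean and variance, a merge of this kind can be written with the pairwise update of Chan, Golub, and LeVeque. The following is a standalone sketch under that assumption; it does not claim to be how accumux merges welford_accumulator, and moments, observe, merge, and from_range are hypothetical names.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical summary state: count, mean, and sum of squared deviations.
// A default-constructed value is the identity element (no observations).
struct moments {
    std::size_t n = 0;
    double mean = 0.0;
    double m2 = 0.0;
};

// Welford's single-observation update.
moments observe(moments s, double x) {
    ++s.n;
    const double delta = x - s.mean;
    s.mean += delta / static_cast<double>(s.n);
    s.m2 += delta * (x - s.mean);
    return s;
}

// Monoid operation: combine two summaries as if their data were concatenated.
moments merge(const moments& a, const moments& b) {
    if (a.n == 0) return b;  // identity on the left
    if (b.n == 0) return a;  // identity on the right
    const double na = static_cast<double>(a.n);
    const double nb = static_cast<double>(b.n);
    const double n = na + nb;
    const double delta = b.mean - a.mean;
    moments out;
    out.n = a.n + b.n;
    out.mean = a.mean + delta * (nb / n);
    out.m2 = a.m2 + b.m2 + delta * delta * (na * nb / n);
    return out;
}

moments from_range(const std::vector<double>& xs) {
    moments s;
    for (double x : xs) s = observe(s, x);
    return s;
}
```

Merging from_range({1, 2}) with from_range({3, 4, 5}) yields the same count, mean, and sum of squared deviations (up to rounding) as from_range({1, 2, 3, 4, 5}).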
Homomorphism Property
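The composition operation preserves structure: feeding a value to a composed accumulator gives the same result as feeding it to each component and composing afterwards.
(a + b).process(x) = a.process(x) + b.process(x)
Together with the merge operation, this enables parallel processing: split the data, accumulate each chunk independently, then merge the partial results.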
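The split, accumulate, and merge workflow can be sketched without any threading machinery. The example below processes chunks one after another for simplicity; since each chunk only touches its own accumulator, each call could in principle run on a separate thread. range_acc, accumulate_chunk, and accumulate_in_chunks are hypothetical names, not part of accumux.

```cpp
#include <algorithm>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical min/max/count accumulator with an explicit merge step.
// A default-constructed value is the identity for merge.
struct range_acc {
    std::size_t n = 0;
    double lo = std::numeric_limits<double>::infinity();
    double hi = -std::numeric_limits<double>::infinity();

    void observe(double x) {
        ++n;
        lo = std::min(lo, x);
        hi = std::max(hi, x);
    }

    void merge(const range_acc& other) {
        n += other.n;
        lo = std::min(lo, other.lo);
        hi = std::max(hi, other.hi);
    }
};

// Accumulate the half-open index range [begin, end) into a fresh accumulator.
range_acc accumulate_chunk(const std::vector<double>& xs,
                           std::size_t begin, std::size_t end) {
    range_acc acc;
    for (std::size_t i = begin; i < end; ++i) acc.observe(xs[i]);
    return acc;
}

// Split xs into at most `chunks` pieces, accumulate each, merge the results.
range_acc accumulate_in_chunks(const std::vector<double>& xs, std::size_t chunks) {
    range_acc total;  // starts as the identity element
    if (chunks == 0) chunks = 1;
    const std::size_t step = (xs.size() + chunks - 1) / chunks;  // ceil division
    for (std::size_t begin = 0; begin < xs.size(); begin += step) {
        const std::size_t end = std::min(begin + step, xs.size());
        // Independent of every other chunk, so this is the parallelizable part.
        total.merge(accumulate_chunk(xs, begin, end));
    }
    return total;
}
```

Whatever chunk count is chosen, the merged result matches a single pass over the whole vector, which is exactly the property that makes the split safe.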
Type Safety with C++20 Concepts
Invalid compositions fail at compile time:
// Compile error: can't add incompatible accumulators
auto invalid = kbn_sum<double>() + kbn_sum<int>(); // Type mismatch!
// OK: compatible types compose
auto valid = kbn_sum<double>() + welford_accumulator<double>();
Use Cases
Financial analysis (track returns, volatility, drawdowns in one pass), scientific computing (online statistics for streaming sensor data), machine learning (feature statistics during data preprocessing), and monitoring systems (real-time metrics aggregation).
Performance
- O(1) space per accumulator: constant memory regardless of data size.
- O(n) time for n data points: a single pass.
- Zero allocations during accumulation.
- Header-only: no linking, no dependencies.
Installation
Accumux is header-only, so you only need to include the headers you use:
#include "accumux/accumulators/kbn_sum.hpp"
#include "accumux/accumulators/welford.hpp"
#include "accumux/core/composition.hpp"
Or with CMake:
add_subdirectory(accumux)
target_link_libraries(your_target PRIVATE accumux::accumux)
Resources
- GitHub: github.com/queelius/accumux
- Paper: AccuMux Technical Report