Dual numbers give you exact derivatives for free. The idea: extend the number system with an element epsilon where epsilon^2 = 0. Evaluate f(x + epsilon) and you get f(x) + f’(x) * epsilon. The derivative falls out of the algebra, no finite differences, no symbolic manipulation.
## Quick Start
```cpp
#include <dual/dual.hpp>
#include <iostream>

int main() {
    using namespace dual;

    // Create a dual variable at x = 2
    auto x = dual<double>::variable(2.0);

    // Compute f(x) = x^3 - 3x + 1
    auto f = x*x*x - 3.0*x + 1.0;

    std::cout << "f(2) = " << f.value() << "\n";       // 3.0
    std::cout << "f'(2) = " << f.derivative() << "\n"; // 9.0
}
```
That’s it. Write your function using overloaded arithmetic, and the derivative computes alongside the value.
## Why This Works
A dual number has the form \(a + b\varepsilon\) where \(\varepsilon^2 = 0\). Arithmetic follows naturally:
$$(a + b\varepsilon) + (c + d\varepsilon) = (a+c) + (b+d)\varepsilon$$

$$(a + b\varepsilon)(c + d\varepsilon) = ac + (ad + bc)\varepsilon + bd\varepsilon^2 = ac + (ad + bc)\varepsilon$$

The \(bd\varepsilon^2\) term vanishes. That's the whole trick.
For any function \(f\), Taylor expansion gives:
$$f(a + b\varepsilon) = f(a) + bf'(a)\varepsilon + \frac{b^2}{2}f''(a)\varepsilon^2 + \cdots = f(a) + bf'(a)\varepsilon$$

All higher-order terms vanish because they contain \(\varepsilon^2\) or higher powers. Set \(b = 1\) (marking x as "the variable we're differentiating with respect to"):
$$f(x + \varepsilon) = f(x) + f'(x)\varepsilon$$

The derivative is the coefficient of \(\varepsilon\).
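A concrete case makes this tangible. To differentiate \(f(x) = x^2\) at \(x = 3\), square the dual number \(3 + \varepsilon\):

$$(3 + \varepsilon)^2 = 9 + 6\varepsilon + \varepsilon^2 = 9 + 6\varepsilon$$

The real part \(9\) is \(f(3)\), and the \(\varepsilon\)-coefficient \(6\) is exactly \(f'(3) = 2 \cdot 3\).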
## API Reference

### `dual<T>`
The core type:
```cpp
// Create a variable for differentiation
auto x = dual<double>::variable(3.0);  // x = 3, dx = 1

// Create a constant
auto c = dual<double>::constant(2.0);  // c = 2, dc = 0

// Access values
double val = x.value();         // 3.0
double deriv = x.derivative();  // 1.0

// Arithmetic operators: +, -, *, /
auto y = sin(x*x) + exp(-x);

// Convenience function
auto [fval, fderiv] = differentiate([](auto x) { return x*x; }, 3.0);
```
### Mathematical Functions
All standard math functions propagate derivatives correctly:
- Basic: `sqrt`, `cbrt`, `abs`
- Exponential: `exp`, `exp2`, `expm1`, `log`, `log2`, `log10`, `log1p`
- Trigonometric: `sin`, `cos`, `tan`, `asin`, `acos`, `atan`, `atan2`
- Hyperbolic: `sinh`, `cosh`, `tanh`, `asinh`, `acosh`, `atanh`
- Power: `pow`, `hypot`
- Special: `erf`, `erfc`
## Higher-Order Derivatives

Second derivatives with `dual2`:
```cpp
auto result = differentiate2([](auto x) { return sin(x); }, 1.0);
// result.value  = sin(1)
// result.first  = cos(1)
// result.second = -sin(1)
```
Arbitrary order with jets:
```cpp
// Compute f, f', f'', f''', f'''' at x = 1
auto derivs = derivatives<4>([](auto x) { return exp(x); }, 1.0);
// All derivatives of e^x at x = 1 equal e
```
## Forward vs. Reverse Mode
This library implements forward mode:
- Pros: Simple, no memory overhead, exact
- Cons: Cost is O(n) for n input variables
For functions with many inputs and few outputs (neural network loss functions, for example), reverse mode (backpropagation) is more efficient. For functions with few inputs and many outputs, or for computing Jacobian-vector products, forward mode wins.
| Use Case | Best Mode |
|---|---|
| f: R -> R^n (one input, many outputs) | Forward |
| f: R^n -> R (many inputs, one output) | Reverse |
| Jacobian-vector products | Forward |
| Vector-Jacobian products | Reverse |
## Why I Built This
Finite differences are approximate. Symbolic differentiation is brittle and slow. Dual numbers give exact derivatives by overloading arithmetic operators. The chain rule applies automatically through operator composition. Any function written in terms of overloaded operators gets differentiated for free.
The algebraic structure matters here. Dual numbers form a ring, and the derivative’s correctness follows from the ring axioms. This is Stepanov’s idea applied to calculus: the algorithm (differentiation) arises from the algebraic structure (the dual number ring), not from the specific type.