Overview
Simplex v0.8.0 introduces Dual Numbers as a native language type, enabling forward-mode automatic differentiation (AD) at the language level with no overhead beyond the derivative arithmetic itself.
Unlike numerical differentiation (which is approximate) or symbolic differentiation (which can be slow and complex), dual numbers compute exact derivatives automatically as your code runs, with the same computational complexity as evaluating the function itself.
How Dual Numbers Work
A dual number extends the real numbers with an infinitesimal component. Where a regular number is just a value x, a dual number is x + x'ε, where ε is an infinitesimal with the defining property ε² = 0.
This elegant mathematical structure encodes the chain rule directly into arithmetic. When you multiply two dual numbers:
(a + a'ε) * (b + b'ε) = ab + (a'b + ab')ε
The result automatically computes the derivative using the product rule! This works for all operations, giving you exact derivatives with no additional effort.
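The same pattern extends to any smooth function: because ε² = 0, the Taylor expansion truncates after the first-order term, giving f(a + a'ε) = f(a) + f'(a)·a'ε. A minimal sketch checking this for sin, using the dual constructors introduced in Basic Usage below:

```
// sin(a + a'ε) = sin(a) + a'·cos(a)·ε, so the derivative slot holds cos(a)
let a = dual::variable(2.0);
let s = a.sin();
println(s.val); // sin(2) =  0.9092...
println(s.der); // cos(2) = -0.4161...
```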
Basic Usage
Creating and using dual numbers is straightforward:
// Create a dual number representing a variable (derivative = 1)
let x: dual = dual::variable(3.0); // value=3, derivative=1
// Operations automatically track derivatives
let y = x * x + x.sin();
// Access value and derivative
println(y.val); // f(3) = 9.1411...
println(y.der); // f'(3) = 2*3 + cos(3) = 5.0100... (exact!)
// Create a constant (derivative = 0)
let c: dual = dual::constant(5.0); // value=5, derivative=0
// Constants don't contribute to derivatives
let z = x * c; // z.der = 5.0 (d/dx of 5x = 5)
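Derivatives also propagate through ordinary function calls; a minimal sketch reusing the expression above, with the same fn syntax the gradient examples use later:

```
// Seeding the argument as a variable is all that is needed to differentiate f
fn f(x: dual) -> dual {
    x * x + x.sin()
}

let y = f(dual::variable(3.0));
println(y.val); // 9.1411...
println(y.der); // 5.0100...
```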
Dual Arithmetic
All arithmetic operations automatically propagate derivatives using calculus rules:
| Operation | Formula | Implementation |
|---|---|---|
| Addition | (f+g)' = f' + g' | Component-wise addition |
| Subtraction | (f-g)' = f' - g' | Component-wise subtraction |
| Multiplication | (fg)' = f'g + fg' | Product rule |
| Division | (f/g)' = (f'g - fg')/g² | Quotient rule |
| Power | (x^n)' = n·x^(n-1) | Power rule |
let x = dual::variable(2.0);
let y = dual::constant(3.0); // hold y fixed so the derivative is taken w.r.t. x
// Product rule: d/dx(x*y) at x=2 = y = 3
let product = x * y;
assert(product.der == 3.0);
// Power rule: d/dx(x^3) at x=2 = 3*2^2 = 12
let cubed = x.pow(3.0);
assert(cubed.der == 12.0);
// Quotient rule: d/dx(1/x) at x=2 = -1/4
let reciprocal = dual::constant(1.0) / x;
assert(reciprocal.der == -0.25);
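The rules compose automatically; a small sketch combining them in one expression (continuing with x = 2 from above):

```
// ratio(x) = (x*x + 1)/x, so ratio'(x) = 1 - 1/x^2
let ratio = (x * x + dual::constant(1.0)) / x;
println(ratio.val); // (4 + 1)/2 = 2.5
println(ratio.der); // 1 - 1/4  = 0.75
```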
Transcendental Functions
All common mathematical functions are differentiable:
let x = dual::variable(1.0);
// Trigonometric
x.sin() // d/dx(sin x) = cos x
x.cos() // d/dx(cos x) = -sin x
x.tan() // d/dx(tan x) = sec^2 x
// Exponential and logarithmic
x.exp() // d/dx(e^x) = e^x
x.ln() // d/dx(ln x) = 1/x
x.sqrt() // d/dx(sqrt x) = 1/(2*sqrt x)
// Hyperbolic (for neural networks)
x.tanh() // d/dx(tanh x) = 1 - tanh^2 x
x.sigmoid() // d/dx(sigmoid x) = sigmoid(x) * (1 - sigmoid(x))
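These chain together like any other operation; a minimal sketch differentiating exp(sin x) at x = 1 (continuing with x from above):

```
// Chain rule: d/dx exp(sin x) = cos(x) * exp(sin x)
let y = x.sin().exp();
println(y.val); // exp(sin 1) = 2.3197...
println(y.der); // cos(1) * exp(sin 1) = 1.2533...
```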
Multi-Dimensional Gradients
For functions with multiple inputs, use multidual<N> to compute all partial
derivatives in a single forward pass:
use simplex::diff::{gradient, jacobian};
// Define a function f(x, y) = x^2 + x*y
fn f(x: dual, y: dual) -> dual {
x.pow(2.0) + x * y
}
// Compute gradient at point (2, 3)
let grad = gradient(f, [2.0, 3.0]);
// grad[0] = df/dx = 2x + y = 7
// grad[1] = df/dy = x = 2
// For vector-valued functions, compute Jacobian
fn g(x: dual, y: dual) -> [dual; 2] {
[x * y, x.sin() + y]
}
let jac = jacobian(g, [1.0, 2.0]);
// Returns the 2x2 matrix of partial derivatives:
// [[dg1/dx, dg1/dy], [dg2/dx, dg2/dy]] = [[y, x], [cos(x), 1]] = [[2.0, 1.0], [0.5403, 1.0]]
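As a usage sketch, the gradient can drive a plain descent step; the 0.1 learning rate below is an arbitrary illustration, not a library default:

```
// One gradient-descent step on f at (2, 3), reusing grad = [7.0, 2.0] from above
let p = [2.0, 3.0];
let p_next = [p[0] - 0.1 * grad[0], p[1] - 0.1 * grad[1]];
// p_next = [2.0 - 0.7, 3.0 - 0.2] = [1.3, 2.8]
```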
Efficiency Note
Forward-mode AD is most efficient when a function has few inputs and many outputs: each input needs its own derivative component, so the cost of a multidual<N> pass grows with N (compare the multidual<10> row in the Performance table below). For neural network backpropagation (many inputs, one output), combine with Neural Gates for optimal performance.
Higher-Order Derivatives
Need second derivatives or Hessians? Use dual2 for second-order differentiation:
use simplex::diff::{hessian};
// dual2 tracks value, first derivative, and second derivative
let x: dual2 = dual2::variable(2.0);
let y = x.pow(3.0);
println(y.val); // 8.0 (value)
println(y.der); // 12.0 (first derivative: 3x^2)
println(y.der2); // 12.0 (second derivative: 6x)
// Compute full Hessian matrix
fn f(x: dual2, y: dual2) -> dual2 {
x.pow(2.0) * y + y.pow(3.0)
}
let hess = hessian(f, [1.0, 2.0]);
// Returns the 2x2 Hessian of second partial derivatives:
// [[d2f/dx2, d2f/dxdy], [d2f/dydx, d2f/dy2]] = [[2y, 2x], [2x, 6y]] = [[4.0, 2.0], [2.0, 12.0]]
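Second derivatives are exactly what Newton-style updates need. A sketch of a single Newton step, assuming dual2 supports the same arithmetic and transcendental methods shown for dual:

```
// One Newton step toward a critical point of h(x) = x^3 - sin(x):
// x -> x - h'(x)/h''(x)
fn h(x: dual2) -> dual2 {
    x.pow(3.0) - x.sin()
}

let x0 = 2.0;
let h0 = h(dual2::variable(x0));
let x1 = x0 - h0.der / h0.der2; // 2.0 - 12.4161/12.9093 = 1.0382...
```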
Integration with Neural Gates
Dual numbers integrate seamlessly with Neural Gates for training differentiable decision logic:
// Neural gate with dual number input
neural_gate classify(features: dual) -> dual
requires features.val > 0.0
ensures result.val >= 0.0 && result.val <= 1.0
{
features.sigmoid()
}
// Training: derivatives flow through the gate
let x = dual::variable(0.5);
let output = classify(x);
println(output.val); // Prediction
println(output.der); // Gradient for backprop
// Use the gradient to update parameters
let target = dual::constant(1.0); // example target value for this snippet
let loss = (output - target).pow(2.0);
let gradient = loss.der; // d(loss)/d(input)
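In this snippet the only seeded variable is the input itself, so the gradient step below lands on x; the same pattern would apply to a gate's trainable parameters. A hypothetical update, with an arbitrary 0.1 learning rate:

```
// Hypothetical gradient-descent step using the derivative computed above
let x_next = dual::variable(x.val - 0.1 * gradient);
let output_next = classify(x_next); // re-evaluate the gate after the update
```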
Performance
Dual numbers compile to highly efficient code with minimal overhead:
| Operation | Throughput | Relative cost vs f64 |
|---|---|---|
| dual add/sub | 500M/sec | ~1x |
| dual mul | 250M/sec | ~2x |
| dual div | 100M/sec | ~2.5x |
| dual sin/cos | 50M/sec | ~2x |
| multidual<10> gradient | 25M/sec | ~10x |
Zero Overhead Abstraction
The overhead is exactly what the mathematics requires, since each operation computes both a value and a derivative. On top of that, the compiler applies aggressive optimizations: struct elimination, chain fusion, and dead derivative elimination (if .der is never read, the derivative is never computed).
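For example, in a sketch like the following only .val is ever read, so the derivative arithmetic is a candidate for dead derivative elimination:

```
let x = dual::variable(2.0);
let y = x.sin() * x;
println(y.val); // only the value is consumed; the unused derivative can be dropped
```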