Overview

Simplex v0.8.0 introduces Dual Numbers as a native language type, enabling forward-mode automatic differentiation (AD) at the language level with no overhead beyond the derivative arithmetic itself.

Unlike numerical differentiation (which is approximate) or symbolic differentiation (which can be slow and complex), dual numbers compute exact derivatives automatically as your code runs, with the same computational complexity as evaluating the function itself.
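
For a concrete sense of the difference, here is a small illustration in plain Rust (not Simplex; the function f and the step size h are chosen only for this example): a central finite difference approximates f'(3) with truncation and round-off error, while the analytic derivative is exact.

finite_difference.rs
fn f(x: f64) -> f64 {
    x * x + x.sin()
}

fn main() {
    let h = 1e-5;
    // Central difference: the error shrinks as O(h^2) but never reaches zero,
    // and making h too small amplifies floating-point round-off instead.
    let approx = (f(3.0 + h) - f(3.0 - h)) / (2.0 * h);
    // Analytic derivative f'(x) = 2x + cos(x), evaluated at x = 3.
    let exact = 2.0 * 3.0 + (3.0_f64).cos();
    println!("finite difference: {approx}");
    println!("exact derivative:  {exact}"); // 5.0100...
}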

[Figure: a dual number dual { val, der } pairs the value f(x) with the derivative f'(x), computed simultaneously]
A dual number carries both value and derivative through every computation

How Dual Numbers Work

A dual number extends real numbers with an infinitesimal component. Where a regular number is just a value x, a dual number is x + x'ε where ε is an infinitesimal (with the property that ε² = 0).

This elegant mathematical structure encodes the chain rule directly into arithmetic. When you multiply two dual numbers:

(a + a'ε) * (b + b'ε) = ab + (a'b + ab')ε + a'b'ε² = ab + (a'b + ab')ε

Because ε² = 0, the last term vanishes and the coefficient of ε is exactly the product rule. The same holds for every operation, giving you exact derivatives with no additional effort (a minimal sketch of this mechanism follows the figure below).

[Figure: chain-rule propagation through operations. Input x = 3.0, dx = 1.0; x * x applies the product rule (val = 9.0, der = 6.0); .sin() applies the chain rule (val ≈ 0.41, der ≈ -5.47) at the output]
Derivatives propagate automatically through each operation
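
The figure's numbers fall out of exactly this arithmetic. As a rough sketch of what a dual type can do under the hood (written in plain Rust rather than Simplex, and not the actual implementation), multiplication drops the ε² term and keeps the product rule, and a unary function such as sin applies the chain rule:

chain_rule_sketch.rs
// A minimal dual number: a value paired with its derivative.
#[derive(Clone, Copy, Debug)]
struct Dual {
    val: f64, // f(x)
    der: f64, // f'(x)
}

impl Dual {
    // A variable is seeded with derivative 1 (d/dx of x is 1).
    fn variable(x: f64) -> Dual {
        Dual { val: x, der: 1.0 }
    }

    // Unary chain rule: d/dx sin(u) = cos(u) * u'.
    fn sin(self) -> Dual {
        Dual { val: self.val.sin(), der: self.val.cos() * self.der }
    }
}

impl std::ops::Mul for Dual {
    type Output = Dual;
    // (a + a'ε)(b + b'ε) = ab + (a'b + ab')ε + a'b'ε², and ε² = 0,
    // so the derivative slot receives exactly the product rule.
    fn mul(self, rhs: Dual) -> Dual {
        Dual {
            val: self.val * rhs.val,
            der: self.der * rhs.val + self.val * rhs.der,
        }
    }
}

fn main() {
    let x = Dual::variable(3.0);
    let squared = x * x;   // val = 9.0, der = 6.0 (product rule)
    let y = squared.sin(); // val = sin(9) ≈ 0.41, der = cos(9) * 6 ≈ -5.47
    println!("{:?} {:?}", squared, y);
}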

Basic Usage

Creating and using dual numbers is straightforward:

dual_basics.sx
// Create a dual number representing a variable (derivative = 1)
let x: dual = dual::variable(3.0);  // value=3, derivative=1

// Operations automatically track derivatives
let y = x * x + x.sin();

// Access value and derivative
println(y.val);  // f(3) = 9.1411...
println(y.der);  // f'(3) = 2*3 + cos(3) = 5.0100... (exact!)

// Create a constant (derivative = 0)
let c: dual = dual::constant(5.0);  // value=5, derivative=0

// Constants don't contribute to derivatives
let z = x * c;  // z.der = 5.0 (d/dx of 5x = 5)

Dual Arithmetic

All arithmetic operations automatically propagate derivatives using calculus rules:

Operation | Formula | Implementation
Addition | (f+g)' = f' + g' | Component-wise addition
Subtraction | (f-g)' = f' - g' | Component-wise subtraction
Multiplication | (fg)' = f'g + fg' | Product rule
Division | (f/g)' = (f'g - fg')/g² | Quotient rule
Power | (x^n)' = n·x^(n-1) | Power rule
arithmetic.sx
let x = dual::variable(2.0);
let y = dual::constant(3.0);  // held constant: we differentiate with respect to x

// Product rule: d/dx(x*y) at x=2, y=3 gives y = 3
let product = x * y;
assert(product.der == 3.0);

// Power rule: d/dx(x^3) at x=2 = 3*2^2 = 12
let cubed = x.pow(3.0);
assert(cubed.der == 12.0);

// Quotient rule: d/dx(1/x) at x=2 = -1/4
let reciprocal = dual::constant(1.0) / x;
assert(reciprocal.der == -0.25);
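
For reference, the quotient and power rules from the table can be sketched the same way. This is illustrative Rust, not the Simplex implementation; the Dual struct mirrors the one-value-one-derivative layout described above, and pow takes a constant exponent as in the example:

quotient_power_sketch.rs
#[derive(Clone, Copy, Debug)]
struct Dual {
    val: f64,
    der: f64,
}

impl Dual {
    fn variable(x: f64) -> Dual { Dual { val: x, der: 1.0 } }
    fn constant(c: f64) -> Dual { Dual { val: c, der: 0.0 } }

    // Power rule for a constant exponent: d/dx u^n = n * u^(n-1) * u'.
    fn pow(self, n: f64) -> Dual {
        Dual { val: self.val.powf(n), der: n * self.val.powf(n - 1.0) * self.der }
    }
}

impl std::ops::Div for Dual {
    type Output = Dual;
    // Quotient rule: (f/g)' = (f'g - fg') / g².
    fn div(self, rhs: Dual) -> Dual {
        Dual {
            val: self.val / rhs.val,
            der: (self.der * rhs.val - self.val * rhs.der) / (rhs.val * rhs.val),
        }
    }
}

fn main() {
    let x = Dual::variable(2.0);
    let cubed = x.pow(3.0);                    // der = 3 * 2² = 12
    let reciprocal = Dual::constant(1.0) / x;  // der = -1/4
    println!("{} {}", cubed.der, reciprocal.der);
}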

Transcendental Functions

All common mathematical functions are differentiable:

transcendental.sx
let x = dual::variable(1.0);

// Trigonometric
x.sin()   // d/dx(sin x) = cos x
x.cos()   // d/dx(cos x) = -sin x
x.tan()   // d/dx(tan x) = sec^2 x

// Exponential and logarithmic
x.exp()   // d/dx(e^x) = e^x
x.ln()    // d/dx(ln x) = 1/x
x.sqrt()  // d/dx(sqrt x) = 1/(2*sqrt x)

// Hyperbolic (for neural networks)
x.tanh()     // d/dx(tanh x) = 1 - tanh^2 x
x.sigmoid()  // d/dx(sigmoid x) = sigmoid(x) * (1 - sigmoid(x))
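
Every entry in this list follows the same unary chain-rule pattern: compute f(val) and multiply f'(val) by the incoming derivative. A hedged sketch in plain Rust (the apply helper and this Dual struct are assumptions made for illustration, not part of Simplex):

unary_chain_rule.rs
#[derive(Clone, Copy, Debug)]
struct Dual {
    val: f64,
    der: f64,
}

impl Dual {
    // Lift any unary function to duals, given f and its derivative f':
    // value = f(val), derivative = f'(val) * der (the chain rule).
    fn apply(self, f: fn(f64) -> f64, df: fn(f64) -> f64) -> Dual {
        Dual { val: f(self.val), der: df(self.val) * self.der }
    }
}

fn main() {
    fn sigmoid(v: f64) -> f64 {
        1.0 / (1.0 + (-v).exp())
    }

    let x = Dual { val: 1.0, der: 1.0 };

    // tanh: derivative is 1 - tanh² x
    let t = x.apply(f64::tanh, |v| 1.0 - v.tanh().powi(2));

    // sigmoid: derivative is sigmoid(x) * (1 - sigmoid(x))
    let s = x.apply(sigmoid, |v| sigmoid(v) * (1.0 - sigmoid(v)));

    println!("tanh:    {} {}", t.val, t.der); // ≈ 0.7616, 0.4200
    println!("sigmoid: {} {}", s.val, s.der); // ≈ 0.7311, 0.1966
}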

Multi-Dimensional Gradients

For functions with multiple inputs, use multidual<N> to compute all partial derivatives in a single forward pass:

gradients.sx
use simplex::diff::{gradient, jacobian};

// Define a function f(x, y) = x^2 + x*y
fn f(x: dual, y: dual) -> dual {
    x.pow(2.0) + x * y
}

// Compute gradient at point (2, 3)
let grad = gradient(f, [2.0, 3.0]);
// grad[0] = df/dx = 2x + y = 7
// grad[1] = df/dy = x = 2

// For vector-valued functions, compute Jacobian
fn g(x: dual, y: dual) -> [dual; 2] {
    [x * y, x.sin() + y]
}

let jac = jacobian(g, [1.0, 2.0]);
// Returns 2x2 matrix of partial derivatives
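
The multidual idea can be sketched as a value that carries one derivative slot per input, with each input seeded by a unit vector, so one forward pass fills in the whole gradient. The Rust below is illustrative only, not how Simplex implements multidual<N>; the MultiDual type, its fields, and the seeding scheme are assumptions for the example:

multidual_sketch.rs
// One value plus N partial derivatives, propagated together.
#[derive(Clone, Copy)]
struct MultiDual<const N: usize> {
    val: f64,
    der: [f64; N], // partial derivatives with respect to each input
}

impl<const N: usize> MultiDual<N> {
    // The i-th input variable: derivative 1 in slot i, 0 elsewhere.
    fn variable(x: f64, i: usize) -> Self {
        let mut der = [0.0; N];
        der[i] = 1.0;
        MultiDual { val: x, der }
    }
}

impl<const N: usize> std::ops::Add for MultiDual<N> {
    type Output = Self;
    fn add(self, rhs: Self) -> Self {
        let mut der = [0.0; N];
        for i in 0..N {
            der[i] = self.der[i] + rhs.der[i]; // sum rule in every slot
        }
        MultiDual { val: self.val + rhs.val, der }
    }
}

impl<const N: usize> std::ops::Mul for MultiDual<N> {
    type Output = Self;
    fn mul(self, rhs: Self) -> Self {
        let mut der = [0.0; N];
        for i in 0..N {
            der[i] = self.der[i] * rhs.val + self.val * rhs.der[i]; // product rule
        }
        MultiDual { val: self.val * rhs.val, der }
    }
}

fn main() {
    // f(x, y) = x^2 + x*y at (2, 3): gradient should be (2x + y, x) = (7, 2)
    let x = MultiDual::<2>::variable(2.0, 0);
    let y = MultiDual::<2>::variable(3.0, 1);
    let f = x * x + x * y;
    println!("df/dx = {}, df/dy = {}", f.der[0], f.der[1]); // 7, 2
}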

Efficiency Note

Forward-mode AD is most efficient when a function has few inputs and many outputs. For neural network backpropagation (many inputs, one output), combine it with Neural Gates for optimal performance.

Higher-Order Derivatives

Need second derivatives or Hessians? Use dual2 for second-order differentiation:

higher_order.sx
use simplex::diff::{hessian};

// dual2 tracks value, first derivative, and second derivative
let x: dual2 = dual2::variable(2.0);

let y = x.pow(3.0);
println(y.val);   // 8.0 (value)
println(y.der);   // 12.0 (first derivative: 3x^2)
println(y.der2);  // 12.0 (second derivative: 6x)

// Compute full Hessian matrix
fn f(x: dual2, y: dual2) -> dual2 {
    x.pow(2.0) * y + y.pow(3.0)
}

let hess = hessian(f, [1.0, 2.0]);
// Returns 2x2 Hessian matrix of second partial derivatives
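
Internally, a second-order dual just applies the differentiation rules twice. A hedged sketch in plain Rust (not the dual2 implementation): multiplication propagates the second derivative with (fg)'' = f''g + 2f'g' + fg''.

dual2_sketch.rs
// Track value, first derivative, and second derivative together.
#[derive(Clone, Copy)]
struct Dual2 {
    val: f64,
    der: f64,
    der2: f64,
}

impl Dual2 {
    fn variable(x: f64) -> Dual2 {
        Dual2 { val: x, der: 1.0, der2: 0.0 }
    }
}

impl std::ops::Mul for Dual2 {
    type Output = Dual2;
    fn mul(self, rhs: Dual2) -> Dual2 {
        Dual2 {
            val: self.val * rhs.val,
            // first-order product rule
            der: self.der * rhs.val + self.val * rhs.der,
            // second-order product rule: (fg)'' = f''g + 2 f'g' + f g''
            der2: self.der2 * rhs.val + 2.0 * self.der * rhs.der + self.val * rhs.der2,
        }
    }
}

fn main() {
    let x = Dual2::variable(2.0);
    let y = x * x * x; // x^3
    println!("{} {} {}", y.val, y.der, y.der2); // 8, 12, 12
}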

Integration with Neural Gates

Dual numbers integrate seamlessly with Neural Gates for training differentiable decision logic:

neural_dual.sx
// Neural gate with dual number input
neural_gate classify(features: dual) -> dual
    requires features.val > 0.0
    ensures result.val >= 0.0 && result.val <= 1.0
{
    features.sigmoid()
}

// Training: derivatives flow through the gate
let x = dual::variable(0.5);
let output = classify(x);

println(output.val);  // Prediction
println(output.der);  // Gradient for backprop

// Use gradient to update parameters
let target = dual::constant(1.0);  // example training label
let loss = (output - target).pow(2.0);
let gradient = loss.der;
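
As a rough end-to-end illustration of that last step (plain Rust rather than a neural gate; the weight, input, target, and learning rate values are made up for this example): seed the trainable weight as the dual variable, run the forward pass, and step against loss.der.

gradient_step_sketch.rs
// A dual number with just the operations needed for this example.
#[derive(Clone, Copy)]
struct Dual { val: f64, der: f64 }

impl Dual {
    fn variable(x: f64) -> Dual { Dual { val: x, der: 1.0 } }
    fn constant(c: f64) -> Dual { Dual { val: c, der: 0.0 } }
    fn sigmoid(self) -> Dual {
        let s = 1.0 / (1.0 + (-self.val).exp());
        Dual { val: s, der: s * (1.0 - s) * self.der } // chain rule
    }
}

impl std::ops::Mul for Dual {
    type Output = Dual;
    fn mul(self, r: Dual) -> Dual {
        Dual { val: self.val * r.val, der: self.der * r.val + self.val * r.der }
    }
}

impl std::ops::Sub for Dual {
    type Output = Dual;
    fn sub(self, r: Dual) -> Dual {
        Dual { val: self.val - r.val, der: self.der - r.der }
    }
}

fn main() {
    let mut w = 0.5;                           // parameter being trained
    let (x, target, lr) = (2.0, 1.0, 0.1);     // illustrative values
    for _ in 0..100 {
        let wd = Dual::variable(w);            // differentiate with respect to w
        let pred = (wd * Dual::constant(x)).sigmoid();
        let err = pred - Dual::constant(target);
        let loss = err * err;                  // squared error
        w -= lr * loss.der;                    // gradient step uses the dual derivative
    }
    println!("trained w = {w:.3}");
}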

Performance

Dual numbers compile to highly efficient code with minimal overhead:

Operation | Throughput | Relative cost vs f64
dual add/sub | 500M/sec | ~1x (~0% overhead)
dual mul | 250M/sec | ~2x
dual div | 100M/sec | ~2.5x
dual sin/cos | 50M/sec | ~2x
multidual<10> gradient | 25M/sec | ~10x

Zero Overhead Abstraction

The overhead is exactly what's expected mathematically—each operation computes both value and derivative. The compiler applies aggressive optimizations: struct elimination, chain fusion, and dead derivative elimination (if .der is never read, it's not computed).