v0.9.0 - Edge Intelligence

Released January 11, 2026

The "Edge Intelligence" release brings two major features: Edge Hive for local-first AI processing on any device, and Self-Learning Annealing for meta-gradient optimization of hyperparameters.

Edge Hive - Local-First AI

Edge Hive brings the Cognitive Hive architecture to edge devices—from smartwatches to desktops. All AI processing happens locally with zero cloud dependency, featuring end-to-end encryption and device-adaptive model selection.

[Diagram: Edge Hive runs on the local device rather than in the cloud. Device classes map to SLM tiers (Smartwatch: Pico-SLM, Phone: Nano-SLM, Tablet: Micro-SLM, Laptop: Mini-SLM, Desktop: Full-SLM), secured with AES-256-GCM encryption, PBKDF2, TLS 1.3, and HMAC-SHA256. Complete offline support, no cloud dependency required.]

Key Features

  • Device-Adaptive Models - Automatic SLM selection based on device capabilities (Pico to Full)
  • End-to-End Encryption - AES-256-GCM for all data, PBKDF2 key derivation, TLS 1.3 transport
  • Complete Offline Support - All processing happens locally, no network required
  • Specialist Types - Local, Shared, Federated, and Sync specialists for different use cases
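
The sketch below shows what device-adaptive model selection might look like in practice. It is illustrative only: the simplex_edge module path, the EdgeHive builder, and the DeviceClass/SlmTier names are assumptions rather than the shipped API; the tier mapping follows the device classes listed above.

edge_hive_sketch.sx
// Illustrative sketch: module path, EdgeHive builder, and enum names are assumed
use simplex_edge::{EdgeHive, DeviceClass, SlmTier};

// Map the detected device class to an SLM tier (Pico through Full)
let tier = match DeviceClass::detect() {
    DeviceClass::Smartwatch => SlmTier::Pico,
    DeviceClass::Phone      => SlmTier::Nano,
    DeviceClass::Tablet     => SlmTier::Micro,
    DeviceClass::Laptop     => SlmTier::Mini,
    DeviceClass::Desktop    => SlmTier::Full,
};

// All processing stays on-device; persisted data is encrypted with AES-256-GCM
let hive = EdgeHive::local()
    .model(tier)
    .encrypt_at_rest(true);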

Self-Learning Annealing

Traditional simulated annealing requires manual tuning of temperature schedules and acceptance thresholds. Self-Learning Annealing uses meta-gradients to learn these hyperparameters during training.

Fixed schedule (traditional):  τ(t) = τ₀ × 0.95^t
Learned schedule (Simplex):    τ(t) = f_θ(t, loss, grad)

Key Features

  • Learnable Temperature - MLP-based schedule that adapts to problem structure
  • Soft Acceptance - Differentiable sigmoid-based acceptance replaces hard thresholds
  • Meta-Optimizer - Outer optimizer (Adam) trains the schedule parameters
  • Convergence Guarantee - Mathematical proof of asymptotic optimality
self_learning.sx
use simplex_training::{LearnableSchedule, MetaOptimizer, SoftAcceptance};

// Create learnable temperature schedule
let schedule = LearnableSchedule::new()
    .initial_temp(10.0)
    .min_temp(0.01)
    .hidden_dim(32);

// Meta-optimizer trains the schedule
let meta = MetaOptimizer::adam(0.001)
    .train(schedule, problem);

// Temperature adapts to problem structure
let temp = schedule.temperature(step, loss, grad);

v0.8.0 - Dual Numbers

Released January 10, 2026

Native forward-mode automatic differentiation with zero runtime overhead. The dual type computes exact gradients alongside values, enabling efficient training of neural gates and optimization.

Unlike reverse-mode AD (backpropagation), forward-mode computes gradients in a single forward pass. This is more efficient for functions with few inputs and many outputs—common in Simplex's streaming architecture where specialists process continuous input streams.

Dual number:  x + εx'  (value + ε × derivative)
Chain rule (automatic):  f(g(x))' = f'(g(x)) · g'(x), computed in a single forward pass

Key Features

  • Zero Overhead - Dual arithmetic compiles to simple struct operations
  • Exact Gradients - No approximation errors from numerical differentiation
  • Composable - Works with all Simplex types and neural gates
  • Streaming Friendly - Ideal for continuous optimization in cognitive agents
dual_numbers.sx
use simplex_training::dual;

// Create dual number: value 3.0, derivative 1.0
let x = dual::new(3.0, 1.0);

// Compute f(x) = x² + 2x with automatic derivative
let f = x * x + 2.0 * x;

println("f(3) = {}", f.value);      // 15.0
println("f'(3) = {}", f.derivative); // 8.0 (2x + 2 at x=3)

v0.7.0 - Real-Time Continuous Learning

Released January 9, 2026

The simplex-learning library enables AI specialists to learn and adapt during runtime without requiring offline batch training. This completes the vision of truly adaptive cognitive agents.

Key Features

  • Online Learning - Specialists adapt in real-time from user feedback
  • Tensor Operations with Autograd - Full automatic differentiation support
  • Streaming Optimizers - SGD, Adam, AdamW with gradient accumulation
  • Safety Constraints - Fallback strategies prevent learning instability
  • Federated Learning - Distributed training across hives with 6 aggregation strategies
  • Knowledge Distillation - Teacher-student and self-distillation support
  • Belief Conflict Resolution - Reconcile beliefs across distributed specialists
online_learning.sx
use simplex_learning::{OnlineLearner, StreamingAdam, SafeFallback, MaxLatency};

// Create an online learner
let learner = OnlineLearner::new(model_params)
    .optimizer(StreamingAdam::new(0.001))
    .constraint(MaxLatency(10.0))
    .fallback(SafeFallback::with_default(default_output));

// Learn from each interaction
for (input, feedback) in interactions {
    let output = learner.forward(&input);
    learner.learn(&feedback);  // Adapts in real-time
}

Architecture

┌──────────────────────────────────────────────────────────────┐
│                       simplex-learning                        │
├────────────┬───────────────┬──────────────┬──────────────────┤
│  tensor/   │    optim/     │   safety/    │   distributed/   │
├────────────┼───────────────┼──────────────┼──────────────────┤
│ Tensor     │ StreamingSGD  │ SafetyBounds │ FederatedLearner │
│ Shape      │ StreamingAdam │ Constraints  │ KnowledgeDistill │
│ Autograd   │ AdamW         │ SafeFallback │ BeliefResolver   │
│ Ops        │ Schedulers    │ SafeLearner  │ HiveCoordinator  │
└────────────┴───────────────┴──────────────┴──────────────────┘
                               │
                               ▼
                     ┌─────────────────┐
                     │    runtime/     │
                     │  OnlineLearner  │
                     │   Checkpoint    │
                     │    Metrics      │
                     └─────────────────┘
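
The distributed/ column above is where federated learning lives. The sketch below is a rough illustration of how an online learner might feed into hive-level aggregation; the FederatedLearner constructor, the Aggregation enum, and the sync_with_hive call are assumed names used only for illustration, with FedAvg shown as one plausible choice among the 6 documented aggregation strategies.

federated_sketch.sx
// Illustrative sketch: constructor, Aggregation variant, and method names are assumed
use simplex_learning::{OnlineLearner, FederatedLearner, Aggregation};

// Each hive trains its own online learner locally...
let local = OnlineLearner::new(model_params);

// ...and a federated learner periodically merges updates across hives
let federated = FederatedLearner::new(local)
    .aggregation(Aggregation::FedAvg)   // one of the 6 aggregation strategies
    .rounds(10);

federated.sync_with_hive(hive_peers);   // hive_peers: assumed handle to peer hives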

v0.6.0 - Neural IR & Neural Gates

Released January 8, 2026

Neural IR transforms Simplex programs into differentiable computation graphs. Programs become learnable, adapting their control flow through training rather than manual tuning.

Inspired by NIR (Neuromorphic Intermediate Representation), which standardizes neuromorphic computing across hardware platforms, Simplex's Neural IR brings similar principles to general-purpose programming: a unified representation that supports both discrete execution and continuous optimization.

The Four Pillars

  • Differentiable Execution - Operations maintain computable gradients throughout the call stack
  • Soft Logic - Boolean decisions return continuous probability values (0.0-1.0)
  • Learnable Parameters - Decision thresholds optimize automatically from training data
  • End-to-End Optimization - Entire programs improve through backpropagation

Mathematical Foundations

The Differentiability Problem

Standard discrete branches (if-else) have zero gradient. You cannot backpropagate through a hard conditional because the derivative of a step function is zero everywhere except at the discontinuity, where it's undefined.

Hard Conditional (non-differentiable):
    f(x) = { A  if x > θ
           { B  otherwise

    df/dx = 0 everywhere (no gradient signal)

Soft Conditional (differentiable):
    f(x) = σ((x - θ) / τ) · A + (1 - σ((x - θ) / τ)) · B

    where σ(z) = 1 / (1 + e^(-z))  (sigmoid)
          τ = temperature (anneals from high to low)

    df/dx = σ'(...) · (A - B) / τ  (gradient flows!)
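
In code, the soft conditional is just the sigmoid blend above. A minimal sketch, assuming an exp() method on f64; everything else follows directly from the formulas.

soft_conditional.sx
// σ(z) = 1 / (1 + e^(-z)); the exp() method on f64 is assumed here
fn sigmoid(z: f64) -> f64 {
    1.0 / (1.0 + (-z).exp())
}

// Blend branches a and b by how far x sits above the threshold θ.
// Small τ sharpens toward the hard branch; large τ mixes both,
// and the result stays differentiable in x, θ, a, and b.
fn soft_select(x: f64, theta: f64, tau: f64, a: f64, b: f64) -> f64 {
    let gate = sigmoid((x - theta) / tau);
    gate * a + (1.0 - gate) * b
}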

Gumbel-Softmax Relaxation

For categorical choices (selecting from N options), we use the Gumbel-Softmax trick. This provides a differentiable approximation to sampling from a categorical distribution.

Categorical Selection (hard):
    y = one_hot(argmax(π))     // Non-differentiable

Gumbel-Softmax (soft):
    g_i ~ Gumbel(0, 1)         // Sample Gumbel noise
    y_i = exp((log(π_i) + g_i) / τ) / Σ_j exp((log(π_j) + g_j) / τ)

Temperature Annealing:
    τ_t = τ_max · (τ_min / τ_max)^(t / T)

    As τ → 0: soft samples → hard samples
    As τ → ∞: soft samples → uniform distribution
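
The sampling step translates directly into code. A minimal sketch, assuming a random() uniform sampler plus ln()/exp() methods and List map()/sum() helpers; only map() appears elsewhere in this document, the rest are assumptions.

gumbel_softmax.sx
// Gumbel noise from a uniform sample: g = -log(-log(u)), u ~ Uniform(0, 1)
fn gumbel_noise() -> f64 {
    let u = random();                    // assumed uniform(0, 1) helper
    -((-(u.ln())).ln())
}

// Soft one-hot sample over class log-probabilities at temperature τ
fn gumbel_softmax(log_pi: List<f64>, tau: f64) -> List<f64> {
    let perturbed = log_pi.map(lp => (lp + gumbel_noise()) / tau);
    let total = perturbed.map(z => z.exp()).sum();
    perturbed.map(z => z.exp() / total)
}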

Soft Logic Operators

Boolean operations transform into continuous approximations that preserve gradient flow:

Soft AND:
    a ∧ b ≈ min(a, b)           // Gödel t-norm
    a ∧ b ≈ a · b               // Product t-norm

Soft OR:
    a ∨ b ≈ max(a, b)           // Gödel t-conorm
    a ∨ b ≈ a + b - a · b       // Probabilistic sum

Soft NOT:
    ¬a ≈ 1 - a

Soft Threshold:
    (x > θ) ≈ σ((x - θ) / τ)    // Sigmoid approximation
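
Written out directly, the product t-norm / probabilistic sum pair looks like this; it is chosen here because it is smooth everywhere, whereas the Gödel min/max pair passes gradient to only one operand at a time. The soft threshold reuses the same exp() assumption as the soft conditional sketch above.

soft_logic.sx
// Product t-norm and probabilistic sum: smooth, gradient-friendly relaxations
fn soft_and(a: f64, b: f64) -> f64 { a * b }
fn soft_or(a: f64, b: f64) -> f64 { a + b - a * b }
fn soft_not(a: f64) -> f64 { 1.0 - a }

// Soft threshold: (x > θ) relaxed to a sigmoid with slope controlled by τ
fn soft_gt(x: f64, theta: f64, tau: f64) -> f64 {
    1.0 / (1.0 + (-((x - theta) / tau)).exp())
}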

Straight-Through Estimator (STE)

For hard constraints where continuous relaxation isn't appropriate, we use the Straight-Through Estimator: forward pass uses hard values, backward pass uses soft gradients.

Forward: y = round(x)         // Hard quantization
Backward: dy/dx = 1           // Gradient passes through unchanged

This allows learning even when the forward pass is non-differentiable.
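
Using the dual numbers from v0.8.0, the estimator can be written as a single helper: quantize the stored value, pass the stored derivative through unchanged. The Dual type name and the round() method are assumptions; dual::new and the value/derivative fields follow the v0.8.0 example.

ste_sketch.sx
use simplex_training::dual;

// Straight-through round: hard value in the forward pass,
// identity gradient (dy/dx = 1) in the backward pass
fn ste_round(x: Dual) -> Dual {
    dual::new(x.value.round(), x.derivative)
}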

IR Primitives

Neural IR defines a set of computational primitives that form the building blocks of differentiable programs. These are inspired by NIR's neuromorphic primitives but adapted for general-purpose computing.

  • Affine - W·x + b linear transformation
  • SoftGate - Sigmoid-gated conditional
  • CategoricalGate - Gumbel-Softmax selector
  • Threshold - Learnable activation threshold
  • Attention - Soft routing via dot-product
  • Embedding - Learned vector lookup
  • Accumulator - Stateful sum with decay
  • Delay - Temporal offset τ
  • WeightedRef - Probabilistic pointer
  • Checkpoint - State snapshot for branching
  • Contract - Verified confidence bound
  • Prune - Dead path elimination marker

Graph Representation

Like NIR, Neural IR represents programs as directed graphs where nodes are primitives and edges denote data flow. The graph supports cycles (for loops and recursion) and can be partitioned across heterogeneous hardware.

[Diagram: Neural IR graph showing computation flow from Input through Embedding, SoftGate, branching paths, Attention, and Output.]
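
A rough sketch of what assembling the graph in the figure might look like. The Graph builder with input(), node(), output(), and connect() is an illustrative assumption; only the primitive names (Embedding, SoftGate, Attention) come from the list above, and the dim/temperature/heads parameters are placeholders.

neural_ir_graph.sx
// Illustrative sketch: Graph builder API and parameters are assumed
use simplex_ir::{Graph, Primitive};

let g = Graph::new();
let input = g.input("query");
let embed = g.node(Primitive::Embedding { dim: 128 });
let gate  = g.node(Primitive::SoftGate { temperature: 1.0 });
let attn  = g.node(Primitive::Attention { heads: 4 });
let out   = g.output("response");

// Edges denote data flow; cycles are allowed for loops and recursion
g.connect(input, embed);
g.connect(embed, gate);
g.connect(gate, attn);
g.connect(attn, out);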

Neural Gates

Neural Gates are the core language feature that enables learnable control flow. They replace hardcoded conditionals with differentiable decision points.

neural_gates.sx
// Define a neural gate - compiles differently for training vs inference
neural_gate should_retry(confidence: f64) -> Bool {
    confidence > 0.7
}

// Training mode compilation (--mode=train):
//   sigmoid((confidence - 0.7) * temperature)
//   Temperature anneals: 10.0 → 0.1 over training

// Inference mode compilation (--mode=infer):
//   confidence > 0.7  (discrete, zero overhead)

// Categorical neural gate - selects from N options
neural_gate route_request(query: Embedding) -> Specialist {
    match classify(query) {
        Category::Technical => Specialist::Engineer,
        Category::Creative => Specialist::Designer,
        Category::Business => Specialist::Analyst,
    }
}

// Gumbel-Softmax makes this differentiable during training
// Returns weighted combination of paths, not hard selection

Compilation Modes

Mode            | Behavior                                              | Use Case
--mode=train    | Soft gates, gradient tracking, temperature annealing  | Training and optimization
--mode=infer    | Hard gates, no gradients, zero overhead               | Production deployment
--mode=profile  | Hard gates with activation statistics collection      | Pruning analysis

Contract Verification

Soft logic loses predictability. If a gate is "85% true," how do you verify correctness? Neural IR introduces Contract Logic with confidence bounds.

contracts.sx
// Gate with verification contracts
neural_gate memory_safe_path(analysis: SecurityAnalysis) -> Bool
    requires analysis.confidence > 0.95    // Must exceed 95%
    ensures result => no_buffer_overflow   // Guarantee if true
    fallback safe_default_path()           // If confidence too low
{
    analysis.is_safe
}

// Contract types:
//   requires  - Pre-conditions (minimum confidence thresholds)
//   ensures   - Post-conditions guaranteed when gate fires
//   invariant - Properties across gate transitions
//   fallback  - Handler when confidence below threshold

// Verification modes:
//   Static    - Prove bounds at compile time via abstract interpretation
//   Dynamic   - Runtime confidence checks with graceful degradation
//   Monte Carlo - Statistical verification for complex compositions

Belief Thresholds in Hive Architecture

Contract thresholds align with existing Simplex belief levels:

  • Anima: 30% - Individual beliefs, flexible
  • Mnemonic: 50% - Shared beliefs, consensus required
  • Divine: 70% - Global beliefs, high confidence

Hardware-Aware Targeting

CPUs excel at branching; GPUs/TPUs excel at tensor operations. Neural IR automatically partitions the computation graph across heterogeneous hardware.

[Diagram: Hardware-aware targeting. Anima graph analysis flows into the graph partitioner, which separates hard gates (CPU), neural gates (GPU/TPU), and memory gates (NPU).]
targeting.sx
// Explicit hardware targeting
@gpu
neural_gate batch_classifier(inputs: List<Embedding>) -> List<Label> {
    // Runs on GPU - batch tensor operations
    inputs.map(e => classify_embedding(e))
}

@cpu
fn process_result(label: Label) -> Action {
    // Runs on CPU - branching logic
    match label {
        Label::Urgent => Action::Escalate,
        Label::Normal => Action::Queue,
        _ => Action::Log
    }
}

@npu
fn cognitive_inference(context: Context) -> Response {
    // Runs on NPU - SLM inference
    infer("Generate response for: {context}")
}

// Automatic targeting (compiler analyzes and decides)
neural_gate smart_router(query: String) -> Specialist {
    // Compiler detects: embedding lookup + softmax = GPU
}

Superposition Memory Model

If a gate is 50% true and 50% false, does the program allocate memory for both branches? Neural IR defines explicit semantics for weighted pointers and lazy branching.

memory.sx
// WeightedRef type - reference with probability
type WeightedRef<T> = {
    ptr: *T,
    weight: f64,        // 0.0 to 1.0
    allocated: Bool,
}

// 1. Lazy Evaluation (default) - allocate only dominant path
let result = match branch_selector(x) {
    A => compute_a(),  // Only if P(A) > lazy_threshold
    B => compute_b(),
}

// 2. Speculative Execution - allocate all, weight results
@speculative
let result = match branch_selector(x) {
    A => compute_a(),  // All allocated
    B => compute_b(),  // Low-weight paths GC'd later
}

// 3. Checkpoint-Restore - snapshot state, explore, restore
@checkpoint
let result = match branch_selector(x) {
    A => { checkpoint(); compute_a() },
    B => { restore(); compute_b() },
}

Mode         | Memory Behavior               | Use Case
Lazy         | Allocate only dominant path   | Production inference
Speculative  | Allocate all, weight, GC      | Training with memory budget
Checkpoint   | Snapshot/restore              | Exact gradient computation
Pooled       | Pre-allocate max, reuse       | Real-time systems

Future Roadmap

With v0.9.0 shipped, Simplex continues toward its 1.0 production release. Here's what's coming next:

0.10 - GPU Acceleration (Q1 2026)

Hardware acceleration for tensor operations and SLM inference on GPU.

  • simplex-training-gpu package with CUDA and Metal backends
  • Automatic GPU memory management with gradient checkpointing
  • Mixed-precision training (FP16/BF16) for 2x training speedup
  • Multi-GPU support with data and model parallelism
  • NPU targeting for edge devices with neural accelerators

0.11 - Distributed Hive Clustering (Q2 2026)

True distributed computing across network nodes with fault tolerance.

  • MPI/NCCL integration for high-bandwidth federated learning
  • Cross-node hive coordination with Raft consensus
  • Network-aware gradient compression (up to 100x reduction)
  • Fault-tolerant distributed checkpointing with automatic recovery
  • Geographic-aware routing for latency optimization

0.12 - Advanced Tooling (Q2 2026)

Developer experience improvements and debugging tools.

  • VS Code and IntelliJ IDE plugins with syntax highlighting and completion
  • Interactive debugger with step-through neural gate visualization
  • Performance profiler with gradient flow analysis
  • REPL with hot-reload support for rapid prototyping
  • Language server protocol (LSP) implementation

1.0 - Production Release (Q3 2026)

Stable release with comprehensive documentation, enterprise features, and ecosystem maturity.

  • API stability guarantees with semantic versioning
  • Complete documentation, tutorials, and architecture guides
  • Production deployment guides for cloud and edge
  • Performance benchmarks and optimization guides
  • Enterprise support options and consulting services
  • Certified model zoo with pre-trained specialists

Feature Integration

Each release builds on previous foundations. Here's how the v0.6.0 through v0.9.0 features connect to create a complete AI-native programming platform:

Feature                      | Foundation                         | Enables
Neural IR (v0.6.0)           | Differentiable computation graphs  | Learnable control flow, neural gates
Real-Time Learning (v0.7.0)  | Neural IR gradients                | Online adaptation, federated learning
Dual Numbers (v0.8.0)        | Forward-mode AD                    | Efficient streaming gradients
Edge Hive (v0.9.0)           | Dual numbers + learning            | On-device AI with privacy
Self-Learning (v0.9.0)       | Dual numbers + meta-optimization   | Auto-tuning hyperparameters

Backward Compatibility

Existing Simplex programs remain fully compatible. Neural IR extends rather than replaces the language. The neural_gate keyword is optional - traditional if-else continues to work as before. Adopt learnable gates only where beneficial.
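
Concretely, both forms can sit side by side in one file: the plain function compiles exactly as written, while only the neural_gate participates in soft compilation and training. The if/else expression syntax here is assumed; the gate itself is the one from the v0.6.0 example.

mixed_control_flow.sx
// Traditional conditional: compiled as-is, never relaxed or trained
fn should_log(level: f64) -> Bool {
    if level > 0.5 {
        true
    } else {
        false
    }
}

// Learnable conditional: soft during --mode=train, discrete during --mode=infer
neural_gate should_retry(confidence: f64) -> Bool {
    confidence > 0.7
}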

References

  • NIR - Neuromorphic Intermediate Representation (Nature Communications, 2024)
  • Gumbel-Softmax - Jang et al., "Categorical Reparameterization with Gumbel-Softmax" (ICLR 2017)
  • Straight-Through Estimator - Bengio et al., "Estimating or Propagating Gradients Through Stochastic Neurons" (2013)
  • Differentiable Programming - Innes et al., "A Differentiable Programming System" (2019)
  • Enzyme - Automatic differentiation for LLVM
  • Simplex Anima - Tutorial 9: Anima & Memory
  • Simplex CHAI - Cognitive Hive AI Architecture

Next Steps