What You'll Learn

  • Creating learnable parameter schedules
  • Using the meta-optimizer
  • Automatic learning rate adaptation
  • Self-learning annealing for optimization

Prerequisites

Complete Tutorial 13: Dual Numbers first.

Step 1: The Problem with Fixed Schedules

Traditional training requires manually choosing hyperparameters:

fixed_schedule.sx
// Manual hyperparameter tuning - tedious and error-prone
let initial_lr = 0.001;      // Too high? Too low?
let decay_rate = 0.95;       // Maybe 0.9 is better?
let temperature = 1.0;       // How should this change?
let warmup_steps = 1000;     // Guessing...

// Hours of experimentation later...

Step 2: Learnable Schedules

With Self-Learning, all schedule parameters are dual numbers that optimize themselves:

learnable_schedule.sx
use simplex_training::LearnableSchedule;

fn main() {
    // Create learnable schedule - parameters are dual numbers
    let schedule = LearnableSchedule {
        initial_lr: dual::variable(0.001),
        decay_rate: dual::variable(0.95),
        min_lr: dual::variable(0.0001),
        warmup_steps: dual::variable(100.0),
    };

    // The schedule learns optimal values during training!
}
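
To make this concrete, here is a minimal sketch of the learning-rate formula such a schedule could apply at each step: linear warmup to initial_lr, then exponential decay floored at min_lr. The helper name lr_at and the dual methods pow and max are assumptions for illustration, not confirmed LearnableSchedule API.

schedule_sketch.sx
// Hypothetical sketch: how a learnable schedule might compute the LR
// at a given step. Because every field is a dual number, gradients
// flow from the loss back into initial_lr, decay_rate, and min_lr.
fn lr_at(s: &LearnableSchedule, step: f64) -> dual {
    if step < s.warmup_steps.val {
        // Linear warmup: ramp from 0 up to initial_lr
        s.initial_lr * (step / s.warmup_steps.val)
    } else {
        // Exponential decay after warmup, floored at min_lr
        let decayed = s.initial_lr * s.decay_rate.pow(step - s.warmup_steps.val);
        decayed.max(s.min_lr)
    }
}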

Step 3: Using the Meta-Optimizer

The meta-optimizer updates the schedule's parameters based on how they affect the final loss:

meta_optimizer.sx
use simplex_training::{MetaOptimizer, LearnableSchedule};

fn objective(params: &[dual]) -> dual {
    // Your loss function
    compute_loss(params)
}

fn main() {
    let schedule = LearnableSchedule::default();

    let mut meta = MetaOptimizer::new(schedule, objective)
        .meta_learning_rate(0.01)
        .inner_steps(50);  // Inner training steps between meta-updates

    // Training loop (`current_params` is your model's parameter
    // vector of dual numbers, as in Tutorial 13)
    for epoch in 0..100 {
        let result = meta.step(current_params);

        println("Epoch {}: loss = {}", epoch, result.loss);
        println("  Learned LR: {}", meta.schedule.initial_lr.val);
    }
}

How Meta-Gradients Work

The meta-optimizer computes ∂Loss/∂schedule_params. This tells us how changing each schedule parameter affects the final loss, allowing gradient-based optimization of the schedule itself.
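As a tiny worked example, here is a meta-gradient computed directly with the dual numbers from Tutorial 13: seed the learning rate as the variable, run a few inner SGD steps, and read off the derivative of the final loss. dual::constant and the dval accessor are assumed names for the constant constructor and derivative part; substitute whatever your dual type exposes.

meta_gradient_sketch.sx
// Differentiate the final loss with respect to the learning rate itself.
fn main() {
    let lr = dual::variable(0.1);     // seed: d(lr)/d(lr) = 1
    let mut w = dual::constant(5.0);  // model weight, constant w.r.t. lr

    // Three inner SGD steps on loss(w) = w^2, whose gradient is 2w
    for _ in 0..3 {
        w = w - lr * (w * 2.0);
    }

    let final_loss = w * w;
    // The derivative part of final_loss is dLoss/d(lr): exactly the
    // meta-gradient the meta-optimizer descends to update the schedule.
    println("dLoss/dLR = {}", final_loss.dval);
}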

Step 4: Self-Learning Annealing

For simulated annealing, use learnable temperature schedules that adapt automatically:

annealing.sx
use simplex_training::{Annealer, LearnableTemperature};

fn energy(state: &State) -> dual {
    // Your energy function (lower is better)
    compute_energy(state)
}

fn main() {
    let temp = LearnableTemperature {
        initial: dual::variable(10.0),
        cool_rate: dual::variable(0.01),
        reheat_threshold: dual::variable(50.0),  // Steps before reheat
        reheat_intensity: dual::variable(2.0),
    };

    let annealer = Annealer::new(temp);
    // `initial_state` is your starting State; 10000 is the step budget
    let solution = annealer.optimize(initial_state, energy, 10000);

    println("Final energy: {}", energy(&solution).val);
    println("Learned cool rate: {}", annealer.temp.cool_rate.val);
    println("Auto re-heats: {}", annealer.stats.reheat_count);
}
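
For intuition, here is a sketch of a temperature update rule consistent with those four fields: exponential cooling, with a multiplicative re-heat once progress stalls. This is an assumed model of LearnableTemperature's behavior (the helper next_temperature is illustrative), not the Annealer's actual internals.

temperature_sketch.sx
// Hypothetical per-step temperature update. Because cool_rate and
// reheat_intensity are dual numbers, the meta-gradient can tune both
// how fast to cool and how hard to re-heat.
fn next_temperature(t: &LearnableTemperature, current: dual,
                    steps_since_improvement: f64) -> dual {
    if steps_since_improvement >= t.reheat_threshold.val {
        // Stuck: re-heat to escape the local minimum
        current * t.reheat_intensity
    } else {
        // Normal cooling: T <- T * (1 - cool_rate)
        current * (dual::constant(1.0) - t.cool_rate)
    }
}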

Step 5: Full Training Pipeline

Combine multiple learnable components for end-to-end self-optimization:

full_pipeline.sx
use simplex_training::{MetaTrainer};

async fn main() {
    let trainer = MetaTrainer::new()
        .with_learnable_lr()           // Self-optimizing learning rate
        .with_learnable_pruning()      // Automatic pruning schedule
        .with_learnable_quantization(); // Adaptive precision

    // `model` and `data` come from your existing training setup
    let result = trainer.train(&model, &data).await;

    println("Final loss: {}", result.final_loss);
    println("Compression: {}x", result.compression_ratio);
    println("Learned schedules saved to: {}", result.schedule_path);
}
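
Since the pipeline saves its learned schedules, a natural follow-up is to warm-start a later run from them instead of the defaults. A hypothetical sketch: LearnableSchedule::load and with_schedule are assumed names, not confirmed API.

reuse_schedule.sx
use simplex_training::{MetaTrainer, LearnableSchedule};

async fn main() {
    // Hypothetical: load the schedule a previous run saved
    // (use the schedule_path printed by that run)
    let learned = LearnableSchedule::load("path/to/saved_schedule");

    let trainer = MetaTrainer::new()
        .with_schedule(learned);

    let result = trainer.train(&model, &data).await;
    println("Final loss: {}", result.final_loss);
}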

Exercise

Take the training example from Tutorial 13 and replace the fixed learning rate with a LearnableSchedule. Compare the final loss and convergence speed.
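If you want a starting point, here is a sketch of the swap. It reuses the objective function from Step 3 and leaves `current_params` as in that step; adapt the names to your Tutorial 13 code.

exercise_starter.sx
use simplex_training::{LearnableSchedule, MetaOptimizer};

fn main() {
    // Before (Tutorial 13): let lr = 0.01;  // fixed, hand-tuned
    // After: the schedule learns the LR during training
    let schedule = LearnableSchedule::default();

    let mut meta = MetaOptimizer::new(schedule, objective)
        .meta_learning_rate(0.01)
        .inner_steps(50);

    // Same training loop as before, but stepped through the
    // meta-optimizer; log result.loss to compare convergence
    for epoch in 0..100 {
        let result = meta.step(current_params);
        println("Epoch {}: loss = {}", epoch, result.loss);
    }
}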

Congratulations!

You've completed all 15 Simplex tutorials! You now understand the language from basics through advanced features like Edge Hive and Self-Learning Optimization. Check out the Examples for complete applications.