What You'll Learn
- Creating learnable parameter schedules
- Using the meta-optimizer
- Automatic learning rate adaptation
- Self-learning annealing for optimization
Prerequisites
Complete Tutorial 13: Dual Numbers first.
Step 1: The Problem with Fixed Schedules
Traditional training requires manually choosing hyperparameters:
```
// Manual hyperparameter tuning - tedious and error-prone
let initial_lr = 0.001;   // Too high? Too low?
let decay_rate = 0.95;    // Maybe 0.9 is better?
let temperature = 1.0;    // How should this change?
let warmup_steps = 1000;  // Guessing...
// Hours of experimentation later...
```
Step 2: Learnable Schedules
With Self-Learning Optimization, every schedule parameter is a dual number that optimizes itself during training:
```
use simplex_training::{LearnableSchedule, MetaOptimizer};

fn main() {
    // Create learnable schedule - parameters are dual numbers
    let schedule = LearnableSchedule {
        initial_lr: dual::variable(0.001),
        decay_rate: dual::variable(0.95),
        min_lr: dual::variable(0.0001),
        warmup_steps: dual::variable(100.0),
    };
    // The schedule learns optimal values during training!
}
```
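To see why these parameters are trainable at all, it helps to read the schedule as a differentiable function of them. The sketch below shows what such a schedule might compute internally; the method name `lr_at` and the exact warmup/decay formula are illustrative assumptions, not part of the documented `simplex_training` API:

```
// Hypothetical internals: the learning rate at step t is an ordinary
// dual-number expression, so derivatives flow back to every parameter.
fn lr_at(schedule: &LearnableSchedule, t: f64) -> dual {
    if t < schedule.warmup_steps.val {
        // Linear warmup from 0 up to initial_lr
        schedule.initial_lr * (t / schedule.warmup_steps)
    } else {
        // Exponential decay after warmup, floored at min_lr
        let decayed = schedule.initial_lr
            * schedule.decay_rate.pow(t - schedule.warmup_steps.val);
        decayed.max(schedule.min_lr)
    }
}
```

Because every operation here is dual arithmetic, the learning rate at any step carries derivatives with respect to `initial_lr`, `decay_rate`, `min_lr`, and `warmup_steps`.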
Step 3: Using the Meta-Optimizer
The meta-optimizer updates schedule parameters based on how they affect final loss:
```
use simplex_training::{MetaOptimizer, LearnableSchedule};

fn objective(params: &[dual]) -> dual {
    // Your loss function
    compute_loss(params)
}

fn main() {
    let schedule = LearnableSchedule::default();
    let meta = MetaOptimizer::new(schedule, objective)
        .meta_learning_rate(0.01)
        .inner_steps(50);  // Steps between meta-updates

    // Training loop (current_params: the model parameters being trained)
    for epoch in 0..100 {
        let result = meta.step(current_params);
        println("Epoch {}: loss = {}", epoch, result.loss);
        println("  Learned LR: {}", meta.schedule.initial_lr.val);
    }
}
```
How Meta-Gradients Work
The meta-optimizer computes ∂Loss/∂schedule_params. This tells us how changing each schedule parameter affects the final loss, allowing gradient-based optimization of the schedule itself.
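As a sketch of the mechanics: one inner update written in dual arithmetic automatically carries the meta-gradient. The `.deriv` accessor and `grad_of_loss` helper below are illustrative assumptions, not documented API:

```
// One inner SGD step: theta' = theta - lr * dLoss/dtheta.
// Because initial_lr is a dual variable, the loss evaluated at theta'
// carries dLoss/d(initial_lr) in its derivative part:
let theta_next = theta - schedule.initial_lr * grad_of_loss(theta);
let meta_grad = objective(&theta_next).deriv;  // ∂Loss/∂initial_lr

// The meta-optimizer then descends on the schedule parameter itself:
// initial_lr <- initial_lr - meta_learning_rate * meta_grad
```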
Step 4: Self-Learning Annealing
For simulated annealing, use learnable temperature schedules that adapt automatically:
```
use simplex_training::{Annealer, LearnableTemperature};

fn energy(state: &State) -> dual {
    // Your energy function (lower is better)
    compute_energy(state)
}

fn main() {
    let temp = LearnableTemperature {
        initial: dual::variable(10.0),
        cool_rate: dual::variable(0.01),
        reheat_threshold: dual::variable(50.0),  // Steps before reheat
        reheat_intensity: dual::variable(2.0),
    };

    // initial_state: your starting State
    let annealer = Annealer::new(temp);
    let solution = annealer.optimize(initial_state, energy, 10000);

    println("Final energy: {}", energy(&solution).val);
    println("Learned cool rate: {}", annealer.temp.cool_rate.val);
    println("Auto re-heats: {}", annealer.stats.reheat_count);
}
```
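For intuition, a learnable temperature schedule can be read as a differentiable function from step count to temperature. Below is a hypothetical sketch; the function `temp_at`, the exponential-cooling formula, and the stagnation test are illustrative assumptions, not part of the documented API:

```
// Hypothetical internals: temperature at step t, with automatic reheating.
fn temp_at(temp: &LearnableTemperature, t: f64,
           steps_since_improvement: f64) -> dual {
    // Exponential cooling from the learned initial temperature
    let cooled = temp.initial * (-temp.cool_rate * t).exp();
    if steps_since_improvement > temp.reheat_threshold.val {
        cooled * temp.reheat_intensity  // re-heat when the search stagnates
    } else {
        cooled
    }
}
```

Since `initial`, `cool_rate`, `reheat_threshold`, and `reheat_intensity` are dual variables, the annealer can adjust all four by gradient descent on the final energy.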
Step 5: Full Training Pipeline
Combine multiple learnable components for end-to-end self-optimization:
```
use simplex_training::{MetaTrainer};

async fn main() {
    let trainer = MetaTrainer::new()
        .with_learnable_lr()             // Self-optimizing learning rate
        .with_learnable_pruning()        // Automatic pruning schedule
        .with_learnable_quantization();  // Adaptive precision

    let result = trainer.train(&model, &data).await;

    println("Final loss: {}", result.final_loss);
    println("Compression: {}x", result.compression_ratio);
    println("Learned schedules saved to: {}", result.schedule_path);
}
```
Exercise
Take the training example from Tutorial 13 and replace the fixed learning rate with a LearnableSchedule. Compare the final loss and convergence speed.
Congratulations!
You've completed all 15 Simplex tutorials! You now understand the language from basics through advanced features like Edge Hive and Self-Learning Optimization. Check out the Examples for complete applications.