The CHAI Philosophy

Cognitive Hive AI (CHAI) is the core architectural philosophy of Simplex. Instead of relying on monolithic large language models (LLMs) for all AI tasks, CHAI orchestrates collections of specialized Small Language Models (SLMs) working together like a hive of specialists.

Key Insight

A team of specialists outperforms a generalist at specific tasks. The same principle applies to language models - a fine-tuned 7B model often beats GPT-4 at its specialty.

The Problem with Monolithic LLMs

Traditional approaches using large language models face several challenges:

Challenge          Impact
---------          ------
High Cost          $0.03-0.12 per 1K tokens adds up quickly at scale
High Latency       500-3000ms response times hurt user experience
Black Box          Complex prompt engineering required, unpredictable behavior
Privacy Concerns   Data must leave your infrastructure
Rate Limits        API quotas and outages affect availability

The CHAI Solution

CHAI solves these problems through five principles:

  • Specialize: Each model masters a narrow domain (summarization, entity extraction, sentiment, etc.)
  • Collaborate: Models communicate via message passing
  • Scale: Add specialists as needs grow
  • Fail Gracefully: One specialist down doesn't stop the hive
  • Cost Pennies: Run on commodity ARM instances

Per-Hive SLM Architecture

The core architectural decision of CHAI v0.5.0: each hive provisions ONE shared SLM that all its specialists use. This is fundamentally different from giving each specialist its own model.

[Diagram: Per-Hive SLM Architecture. One shared SLM serves multiple specialist Animas (Analyst, Coder, Reviewer), connected through the HiveMnemonic shared consciousness.]

Why Per-Hive, Not Per-Specialist?

Per-Specialist (Old)         Per-Hive (CHAI v0.5.0)
--------------------         ----------------------
10 specialists = 10 models   10 specialists = 1 model
80+ GB RAM required          8-12 GB RAM total
Expensive, wasteful          Efficient, practical
No shared consciousness      HiveMnemonic creates collective knowledge

Key Insight

Each specialist has its own Anima (personal memories and beliefs), but all specialists share the Hive SLM for inference. The HiveMnemonic provides shared consciousness - what one specialist learns, all can access.

Core Constructs

Specialists

A specialist is an actor that wraps a small language model fine-tuned for a specific task:

entity-extractor.sx
specialist EntityExtractor {
    model: "ner-fine-tuned-7b",         // fine-tuned SLM backing this specialist
    domain: "named entity extraction",  // matched against requests by semantic routers
    memory: 8.GB,                       // memory budget for this specialist
    temperature: 0.1,                   // low temperature for deterministic extraction
    max_tokens: 500,

    receive Extract(text: String) -> List<Entity> {
        let raw = infer("Extract all named entities from: {text}")
        parse_entities(raw)
    }
}
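
A caller interacts with the specialist through message passing. A minimal usage sketch, borrowing the ask/await pattern from the ensemble examples later in this section (the extractor binding is assumed to reference a running EntityExtractor):

usage.sx
// Hypothetical usage: send Extract and await the typed reply
let entities = await ask(extractor, Extract("Apple hired Tim Cook in 1998."))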

The infer Primitive

Inside a specialist, the infer function calls the underlying model:

infer-examples.sx
// Basic inference
let result = infer(prompt)

// With parameters
let result = infer(prompt, temperature: 0.7, max_tokens: 200)

// Streaming
for chunk in infer_stream(prompt) {
    emit(chunk)
}

// Typed extraction
let data = infer_typed<Person>(prompt)
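
For example, infer_typed can parse model output straight into a declared type. A sketch; the type declaration syntax here is an assumption, since this section does not show one:

typed-extraction.sx
// Hypothetical record type consumed by infer_typed (declaration syntax assumed)
type Person {
    name: String,
    age: Int,
}

let bio = "Ada Lovelace, 36, mathematician."
let person = infer_typed<Person>("Extract the person from: {bio}")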

Hives

A hive is a supervisor for specialists with a shared SLM and collective memory:

document-hive.sx
hive DocumentProcessor {
    // Specialists in this hive
    specialists: [
        Summarizer,
        EntityExtractor,
        SentimentAnalyzer,
        TopicClassifier
    ],

    // Shared SLM for all specialists (v0.5.0)
    slm: "simplex-cognitive-7b",

    // Shared consciousness across specialists
    mnemonic: {
        episodic: { capacity: 1000, importance_threshold: 0.4 },
        semantic: { capacity: 5000 },
        beliefs: { revision_threshold: 50 },  // 50% for hive beliefs
    },

    // How tasks are routed to specialists
    router: SemanticRouter(
        embedding_model: "simplex-mnemonic-embed",
        fallback: Summarizer
    ),

    strategy: OneForOne,
}
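
Requests are sent to the hive itself, which routes them to the best-matching specialist. A hypothetical usage sketch (spawn is an assumption; ask appears in the ensemble examples below):

hive-usage.sx
// Hypothetical sketch: the hive's router forwards the request to a specialist
let hive = spawn DocumentProcessor            // `spawn` is assumed
let summary = await ask(hive, Summarize(doc))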

Routing Strategies

CHAI supports multiple routing strategies to direct requests to the right specialist:

Semantic Router

Uses embedding similarity to match requests with specialist domains. Best for natural language queries.

Rule Router

Pattern matching with explicit rules. Best for structured inputs with clear categories.

LLM Router

A small model decides which specialist to invoke. Best for complex, ambiguous requests.

Cascade Router

Try specialists in order until one succeeds. Best for fallback scenarios.
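
As configuration, the four strategies might look like the sketch below. Only SemanticRouter appears elsewhere in this section, so the other constructors and their parameters are assumptions:

routers.sx
// Sketches of each routing strategy inside a hive definition.

// Semantic: embed the request, match against specialist domains
router: SemanticRouter(
    embedding_model: "simplex-mnemonic-embed",
    fallback: Summarizer
),

// Rule: explicit patterns for structured inputs (hypothetical RuleRouter)
router: RuleRouter([
    rule("invoice|receipt", to: EntityExtractor),
    rule("review|complaint", to: SentimentAnalyzer),
]),

// LLM: a small model picks the specialist (hypothetical LLMRouter)
router: LLMRouter(model: "simplex-cognitive-7b"),

// Cascade: try specialists in order until one succeeds (hypothetical)
router: CascadeRouter([TopicClassifier, Summarizer]),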

Ensemble Patterns

Combine multiple specialists for better results:

ensembles.sx
// Parallel - all specialists work simultaneously
let results = await parallel(
    ask(summarizer, Summarize(doc)),
    ask(extractor, Extract(doc)),
    ask(classifier, Classify(doc))
)

// Voting - multiple specialists vote on a decision
let verdict = await vote(
    [judge1, judge2, judge3],
    Evaluate(submission),
    threshold: 0.6
)

// Chain - sequential pipeline processing
let result = doc
    |> ask(cleaner, Clean)
    |> ask(translator, Translate(to: "en"))
    |> ask(summarizer, Summarize)

HiveMnemonic: Shared Consciousness

The HiveMnemonic is the shared memory layer that creates collective consciousness across all specialists in a hive. Unlike traditional RAG systems that rely on vector similarity search, the HiveMnemonic integrates directly with each specialist's Anima to form a unified cognitive substrate.

What One Learns, All Know

When a specialist learns something new, it can contribute that knowledge to the HiveMnemonic. Other specialists automatically benefit from this shared knowledge on their next inference.

Contributing to Shared Memory

mnemonic.sx
specialist Researcher {
    receive Research(topic: String) -> Findings {
        let findings = do_research(topic)

        // Personal memory (my Anima only)
        self.anima.remember("I researched: {topic}")

        // Shared memory (HiveMnemonic - all specialists can access)
        hive.mnemonic.learn("Research finding: {findings.summary}")
        hive.mnemonic.believe(
            "Topic {topic} is well-documented",
            confidence: 80
        )

        findings
    }
}

specialist Synthesizer {
    receive Synthesize(query: String) -> Report {
        // Recall from shared HiveMnemonic
        let team_knowledge = hive.mnemonic.recall_for(query)

        // Recall from personal Anima
        let my_experience = self.anima.recall_for(query)

        // Both are interpolated into the inference on the shared Hive SLM
        infer("Using team knowledge {team_knowledge} and my experience {my_experience}, create a synthesis report for: {query}")
    }
}

How Context Flows to the SLM

When a specialist calls infer(), context is automatically assembled from both personal and shared memory, as sketched after the steps below:

  1. Personal context - The specialist's Anima memories are formatted
  2. Shared context - The Hive's Mnemonic is added
  3. Combined prompt - Both contexts are prepended to the prompt
  4. Inference - Sent to the shared Hive SLM
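
A simplified sketch of that assembly (the function and the hive.slm accessor are illustrative, not a documented API):

context-assembly.sx
// Illustrative sketch of what infer() does internally; names are assumptions
fn assemble_and_infer(specialist, prompt) {
    let personal = specialist.anima.recall_for(prompt)   // 1. personal context
    let shared = hive.mnemonic.recall_for(prompt)        // 2. shared context
    let combined = "{personal}\n{shared}\n{prompt}"      // 3. combined prompt
    hive.slm.infer(combined)                             // 4. shared Hive SLM
}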

Three-Tier Memory Hierarchy

CHAI implements a three-level memory system with different belief thresholds at each level:

[Diagram: Three-Tier Memory Hierarchy. Specialist Anima (30% threshold) flows up to Hive Mnemonic (50% threshold), which flows up to Divine (70% threshold).]

Belief Thresholds

Different levels require different amounts of evidence to revise beliefs:

Level                Threshold   Purpose
-----                ---------   -------
Anima (Individual)   30%         Flexible personal beliefs, quick to adapt
Mnemonic (Hive)      50%         Shared beliefs require consensus
Divine (Global)      70%         Organization-wide truths, high confidence required

Belief Propagation

A belief held by an individual Anima at high confidence can be promoted to the HiveMnemonic. Similarly, hive beliefs can propagate to Divine level when multiple hives reach consensus. This creates emergent organizational knowledge from individual learning.
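
A sketch of what promotion could look like in code (confidence_of is an assumed accessor; believe appears in mnemonic.sx above):

belief-promotion.sx
// Hypothetical sketch: promote a confident personal belief to the hive
let claim = "Topic {topic} is well-documented"
if self.anima.confidence_of(claim) >= 80 {        // assumed accessor
    hive.mnemonic.believe(claim, confidence: 80)  // API shown earlier
}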

Cost Analysis

CHAI dramatically reduces AI costs compared to external APIs:

Configuration                            Monthly Cost
-------------                            ------------
Small hive (5 specialists, CPU)          ~$35/month
Medium hive (10 specialists, CPU)        ~$85/month
High-performance (10 specialists, GPU)   ~$1,200/month

Compared to the GPT-4 API:

Requests/Month   CHAI Cost   API Cost   Savings
--------------   ---------   --------   -------
100K             $35         $300       88%
1M               $85         $3,000     97%
10M              $1,200      $30,000    96%
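
These figures follow directly from the two tables: at 1M requests/month, savings = 1 - ($85 / $3,000) ≈ 97.2%, which rounds to the 97% shown.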

Naming Conventions

CHAI offers two naming traditions for specialists:

Elvish (Poetic)

Sindarin/Quenya names for an organic feel: Isto (Knowledge), Penna (Storyteller), Curu (Craft), Silma (Clarity)

Latin (Technical)

Classical names for a formal feel: Cogito (Think), Scribo (Write), Lego (Read), Faber (Craftsman)
