The CHAI Philosophy

Cognitive Hive AI (CHAI) is the core architectural philosophy of Simplex. Instead of relying on monolithic large language models (LLMs) for all AI tasks, CHAI orchestrates collections of specialized Small Language Models (SLMs) working together like a hive of specialists.

Key Insight

A team of specialists outperforms a generalist on specific tasks. The same principle applies to language models: a fine-tuned 7B model often beats GPT-4 at its specialty.

The Problem with Monolithic LLMs

Traditional approaches using large language models face several challenges:

Challenge          Impact
High Cost          $0.03-0.12 per 1K tokens adds up quickly at scale
High Latency       500-3000ms response times hurt user experience
Black Box          Complex prompt engineering required, unpredictable behavior
Privacy Concerns   Data must leave your infrastructure
Rate Limits        API quotas and outages affect availability

The CHAI Solution

CHAI solves these problems through five principles:

  • Specialize: Each model masters a narrow domain (summarization, entity extraction, sentiment, etc.)
  • Collaborate: Models communicate via message passing
  • Scale: Add specialists as needs grow
  • Fail Gracefully: One specialist down doesn't stop the hive
  • Cost Pennies: Run on commodity ARM instances

Core Constructs

Specialists

A specialist is an actor that wraps a small language model fine-tuned for a specific task:

entity-extractor.sx
specialist EntityExtractor {
    model: "ner-fine-tuned-7b",          // fine-tuned NER model to load
    domain: "named entity extraction",   // matched by routers against incoming requests
    memory: 8.GB,
    temperature: 0.1,                    // low temperature keeps extraction deterministic
    max_tokens: 500,

    receive Extract(text: String) -> List<Entity> {
        // Run inference, then parse the raw output into typed entities
        let raw = infer("Extract all named entities from: {text}")
        parse_entities(raw)
    }
}
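
Once defined, a specialist is used like any other actor. A minimal usage sketch; the spawn keyword is an assumption (only ask and await appear elsewhere in this section):

specialist-usage.sx
// Hypothetical: start the specialist and send it a typed message
let extractor = spawn EntityExtractor
let entities = await ask(extractor, Extract("Tim Cook joined Apple in 1998."))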

The infer Primitive

Inside a specialist, the infer function calls the underlying model:

infer-examples.sx
// Basic inference
let result = infer(prompt)

// With parameters
let result = infer(prompt, temperature: 0.7, max_tokens: 200)

// Streaming
for chunk in infer_stream(prompt) {
    emit(chunk)
}

// Typed extraction
let data = infer_typed<Person>(prompt)
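
The Person type in the typed-extraction example is not defined in this section. A minimal sketch of what it might look like, assuming Simplex supports struct-style declarations (the fields are illustrative):

person.sx
// Hypothetical target type for infer_typed<Person>
struct Person {
    name: String,
    age: Int
}

// The model's output is parsed and validated against the struct's fields
let data = infer_typed<Person>(prompt)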

Hives

A hive is a supervisor that manages a group of specialists and intelligently routes requests among them:

document-hive.sx
hive DocumentProcessor {
    specialists: [
        Summarizer,
        EntityExtractor,
        SentimentAnalyzer,
        TopicClassifier
    ],

    // Route requests by embedding similarity to specialist domains
    router: SemanticRouter(
        embedding_model: "all-minilm-l6-v2"
    ),

    // Supervision: restart only the failed specialist
    strategy: OneForOne,

    // Semantic memory shared by all specialists in the hive
    memory: SharedVectorStore(dimension: 384),

    // Rolling conversation history available to every specialist
    context: ConversationBuffer(max_turns: 50)
}

Routing Strategies

CHAI supports multiple routing strategies to direct requests to the right specialist:

Semantic Router

Uses embedding similarity to match requests with specialist domains. Best for natural language queries.

Rule Router

Pattern matching with explicit rules. Best for structured inputs with clear categories.

LLM Router

A small model decides which specialist to invoke. Best for complex, ambiguous requests.

Cascade Router

Try specialists in order until one succeeds. Best for fallback scenarios.
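
For illustration, the four strategies might be configured as follows. Only SemanticRouter appears elsewhere in this section; the RuleRouter, LLMRouter, and CascadeRouter constructors and their parameters are assumptions written in the same style:

router-examples.sx
// Semantic: embed the request and match against specialist domains
router: SemanticRouter(embedding_model: "all-minilm-l6-v2")

// Rule (hypothetical): explicit patterns for structured inputs
router: RuleRouter(rules: [
    ("extract *", EntityExtractor),
    ("summarize *", Summarizer)
])

// LLM (hypothetical): a small model chooses the specialist
router: LLMRouter(model: "router-1b")

// Cascade (hypothetical): try specialists in order until one succeeds
router: CascadeRouter(order: [Summarizer, EntityExtractor])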

Ensemble Patterns

Combine multiple specialists for better results:

ensembles.sx
// Parallel - all specialists work simultaneously
let results = await parallel(
    ask(summarizer, Summarize(doc)),
    ask(extractor, Extract(doc)),
    ask(classifier, Classify(doc))
)

// Voting - multiple specialists vote on a decision
let verdict = await vote(
    [judge1, judge2, judge3],
    Evaluate(submission),
    threshold: 0.6
)

// Chain - sequential pipeline processing
let result = doc
    |> ask(cleaner, Clean)
    |> ask(translator, Translate(to: "en"))
    |> ask(summarizer, Summarize)

Shared Memory

Specialists can share context through hive memory:

  • Vector Store: Semantic search across all specialists
  • Conversation Context: Shared conversation history
  • Working Memory: Short-term shared state with TTL

shared-memory.sx
specialist ResearchAssistant {
    // ...

    receive Research(topic: String) -> Report {
        // Search shared vector store
        let relevant = hive.memory.search(
            ai::embed(topic),
            limit: 10
        )

        // Get conversation context
        let history = hive.context.recent(5)

        // Generate with context
        let report = infer(build_prompt(topic, relevant, history))

        // Store for future searches
        hive.memory.store(report)

        report
    }
}

Cost Analysis

CHAI dramatically reduces AI costs compared to external APIs:

Configuration                            Monthly Cost
Small hive (5 specialists, CPU)          ~$35/month
Medium hive (10 specialists, CPU)        ~$85/month
High-performance (10 specialists, GPU)   ~$1,200/month

Compared to the GPT-4 API:

Requests/Month   CHAI Cost   API Cost   Savings
100K             $35         $300       88%
1M               $85         $3,000     97%
10M              $1,200      $30,000    96%
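
The savings column is (API cost - CHAI cost) / API cost: at 1M requests per month, ($3,000 - $85) / $3,000 ≈ 97%.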

Naming Conventions

CHAI offers two naming traditions for specialists:

Elvish (Poetic)

Sindarin/Quenya names for an organic feel: Isto (Knowledge), Penna (Storyteller), Curu (Craft), Silma (Clarity)

Latin (Technical)

Classical names for a formal feel: Cogito (Think), Scribo (Write), Lego (Read), Faber (Craftsman)

Next Steps