Thing Event System by Pentatonic

We gave AI agents a brain. And a conscience. And a perfect memory.

Every agent framework gives you inference. None of them give you continuity, accountability, or self-correction. We built all three.

March 18, 2026 · 18 min read · Pentatonic Engineering

Three systems. One stack. The first complete cognitive architecture for AI agents — persistent memory modelled on the mammalian brain, self-evolving bias detection that runs natural selection on its own rules, and an immutable event ledger that captures everything your agents see, decide, and do.

We've been building this for two years. Today it ships.

Agent Memory is a 7-layer cognitive architecture modelled on how biological brains actually work — hippocampal replay, cortical consolidation, engram networks, multi-trace retrieval. We studied how mammals form, store, and recall memories, and we built the same system for AI. The result is an agent that doesn't just retrieve information — it accumulates understanding, compounds knowledge across sessions, and reasons over relationships the way a human mind does. 47% better recall than anything else on the market. Runs entirely on-device. Zero cloud dependency.

Bias Evolution is a living detection system that evolves its own rules through natural selection. Six concurrent signal sources — including a dreaming engine that replays failures overnight — generate mutation hypotheses, score fitness against live data, and converge on bias patterns no human would ever think to look for. Provably constrained from degenerating. The system that finds bias you didn't know to look for.

TES is the sensory system — an immutable, event-sourced ledger that captures every event across an enterprise and makes it AI-queryable. Agent decisions, business operations, customer interactions, financial transactions, supply chain movements — every event gets AI enrichment, semantic embeddings, and reactive processing automatically. State is never stored. It's derived from the event stream. Agent Memory and Bias Evolution don't operate in a vacuum — TES is how they see the world.

Together, this is the upgrade from language model to autonomous intelligent system. The LLM provides inference. We provide everything else.


The Brain: Agent Memory

Agent Memory is a 7-layer cognitive architecture modelled on how the mammalian brain actually forms, stores, and recalls information. The hippocampus captures experiences rapidly. The neocortex consolidates them into structured knowledge. Engram networks distribute meaning across cell assemblies. Recall reconstructs from multiple traces simultaneously. We built each of those systems for AI — and wired them together.

The result is an agent that doesn't start from zero every session. Information flows down through seven layers, context flows up. The layers compensate for each other's weaknesses, the same way your brain uses different systems for different kinds of remembering. Where vector stores give you cosine similarity and call it retrieval, Agent Memory gives you actual cognition — accumulating understanding across sessions, compounding knowledge, reasoning over relationships.

The seven layers:

  • L0: Platform adapter. Pluggable interface for any AI framework — instant keyword search across the full workspace.

  • L1: System files. Persistent core context — always loaded, always available. The neocortical bedrock.

  • L2: HybridRAG orchestrator. Multi-source fusion with query expansion and cross-encoder reranking. The conductor.

  • L3: Knowledge graph. Rich entity taxonomy with hyperedge traversal and relationship reasoning. Memories aren't stored in single locations — they're distributed patterns. Entities as nodes, relationships as edges, meaning in topology.

  • L4: Semantic search. 4096-dimensional embeddings — #1 ranked on MTEB. 5x more semantic surface area than standard 768-dim models.

  • L5: Communications layer. Semantic search across chats, emails, contacts, meeting notes. Everything your agent has ever been told.

  • L6: Document store. Hybrid semantic + full-text search across 20+ file formats with neural reranking.
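The L2 orchestrator's rank fusion can be sketched in a few lines. This is an illustrative reciprocal rank fusion (RRF) implementation, not Pentatonic's code, and the real orchestrator layers query expansion and cross-encoder reranking on top:

```javascript
// Reciprocal Rank Fusion (RRF): merge ranked lists from multiple
// retrieval sources. Each document earns 1 / (k + rank) per list;
// k = 60 is the conventional damping constant.
function fuseRanked(rankedLists, k = 60) {
  const scores = new Map();
  for (const list of rankedLists) {
    list.forEach((docId, rank) => {
      scores.set(docId, (scores.get(docId) ?? 0) + 1 / (k + rank + 1));
    });
  }
  // Highest fused score first.
  return [...scores.entries()].sort((a, b) => b[1] - a[1]).map(([id]) => id);
}

// A document ranked second by BOTH sources beats documents that
// only one source ranked first: consensus wins.
const fused = fuseRanked([
  ["sso_notes", "auth_rfc", "login_bug"],   // graph-side ranking
  ["pricing_doc", "auth_rfc", "login_bug"], // vector-side ranking
]);
// fused[0] === "auth_rfc"
```

RRF is a natural fit for multi-source fusion because it uses only ranks, never raw scores, so sources with incomparable scoring scales fuse cleanly.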

The breakthrough that makes this work is graph-first retrieval. Every other memory system — every RAG implementation, every vector store, every framework built-in — does vector search first. Throw the query at an embedding space and hope the nearest neighbours are relevant. We do the opposite.

The entity graph traversal runs first. It extracts entities and relationships from the query — persons, projects, systems, decisions, commitments, incidents, transactions, deadlines, routines, lessons, preferences, topics, events, tools. That graph structure then informs the vector search. Sequential, not parallel. The graph tells the vector store what to look for. This mirrors how biological memory works: the brain doesn't do a similarity search across all neurons. It activates a network of associations, and those associations guide retrieval.

This is why "who is my accountant?" resolves via entity chain traversal, not cosine distance. Why "what's the margin on Project X?" returns the relationship-linked numeric context, not the five most similar paragraphs. Why "what did we decide about auth last month?" follows a decision entity to its connected commitments and deadlines, not a fuzzy text match.
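In miniature, graph-first retrieval looks something like this. The graph, entity names, and `traverse` helper are toy stand-ins for the full L3 taxonomy:

```javascript
// A toy entity graph: adjacency lists of typed edges.
const graph = {
  me:          [{ rel: "retains", to: "jane_doe" }],
  jane_doe:    [{ rel: "role", to: "accountant" }],
  auth_module: [{ rel: "decided_in", to: "adr_014" }],
};

// Stage 1: entity traversal. Follow edges out from entities found
// in the query; this runs BEFORE any vector search.
function traverse(entity, depth = 2) {
  if (depth === 0 || !graph[entity]) return [];
  return graph[entity].flatMap(({ rel, to }) => [
    { from: entity, rel, to },
    ...traverse(to, depth - 1),
  ]);
}

// Stage 2 (not shown): the traversed subgraph scopes the vector
// search, so nearest-neighbour lookup runs only over chunks linked
// to these entities instead of the whole corpus.
const chain = traverse("me");
// chain: me -retains-> jane_doe -role-> accountant
```

"Who is my accountant?" resolves by walking that two-hop chain; cosine similarity never has to guess.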

Powering the graph is PRefLexOR — an 8-billion-parameter reasoning model built on Professor Markus Buehler's open-source architecture, fine-tuned for structured entity-relationship inference. Five-stage reasoning: concept identification, entity/relation mapping, machine-parseable graph output, causal pattern detection, and final synthesis with confidence scoring. It runs 2-pass concurrent extraction — 16 parallel inference calls per wave — at 32 tokens/sec on-device. This isn't off-the-shelf RAG with a database bolted on. It's a reasoning engine that thinks in graphs.

tes-memory-query.js
import { TESClient } from "@pentatonic/ai-agent-sdk";

const tes = new TESClient({ endpoint: "http://localhost:4000/graphql" });

const results = await tes.query(`query {
  semanticSearchMemories(
    input: "What did we decide about auth?"
    strategy: HYBRID
    fusion: RECIPROCAL_RANK
    confidenceThreshold: 0.7
  ) {
    source   # KNOWLEDGE_GRAPH | VECTOR | FILES
    reasoning
    matches { content score entityId }
  }
}`);
// reasoning: "Found via entity 'auth_module' → 3 edges"
  • 47% better recall

  • 0.9 context relevance

  • <30ms graph traversal

  • 100% on-device

The entire stack runs 100% on-device — zero cloud dependency, data never leaves the machine. Dual-backend embeddings automatically select the best available model for the hardware, from high-end GPU to laptop CPU, with graceful degradation and the same interface either way.

Coming next: multi-agent memory sharing via swarm intelligence protocols — agents sharing observations, debating interpretations, reaching consensus like distributed neurons forming a unified memory. Multi-agent memory isn't a shared database. It's collective cognition. More on this soon.

An LLM with Agent Memory doesn't just answer questions. It accumulates understanding. Session over session, it builds a structured model of entities, relationships, decisions, and context that compounds over time. That's not retrieval. That's cognition. And whoever owns it, owns the ecosystem.

Open source core (L0–L4) under Apache 2.0. Pro tier (L5–L6) adds communications ingestion, document store, and multi-agent memory sync.


The Conscience: Bias Evolution

Bias Evolution is a detection system that writes its own rules through natural selection. Six concurrent signal sources run over your event stream, generating mutation hypotheses, scoring fitness against live data, and converging on bias patterns that no human would think to look for. It finds decision skew, preference drift, information asymmetry, representation gaps, temporal anchoring, feedback amplification — emergent distortions that arise from interaction patterns and feedback loops, not from any single decision.

This matters because bias in agentic systems isn't a configuration error you can write a rule for. It's an emergent property of complex systems — and it evolves. Static rules catch yesterday's patterns. Threshold alerts miss 0.1%-per-day drift that compounds to over 40% annual distortion. Periodic audits can't detect biases that shift when observed. Detection has to evolve as fast as the bias itself.
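The compounding arithmetic is easy to verify (a standalone check, not product code):

```javascript
// 0.1% daily drift, compounded over a year.
const annualDrift = Math.pow(1.001, 365) - 1;
// annualDrift ≈ 0.44, i.e. roughly 44% distortion,
// far beyond what a static threshold tuned today would catch.
```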

Bias Evolution does exactly that.

Each of the six signal sources attacks the problem from a different angle:

1. Dreaming engine. Nightly failure analysis scans for 36+ failure patterns — errors, timeouts, auth failures, crashes, retries. Generates mutation candidates and auto-drafts patches from what went wrong while the system was asleep. Your agent literally dreams about its mistakes. Hippocampal replay for machines.

2. Symbolic learning. Weekly pattern extraction from the event history. Categorises signals into security, features, fixes, and refactoring. Generates language gradients — structured descriptions of what should change in agent behaviour.

3. Active learning. Converts playbooks with 80%+ success rates into proactive behaviour rules. What worked becomes what's enforced.

4. Self-model. Tracks 8 capability dimensions — technical execution, communication, memory management, security, planning, relationship management, creative problem solving, self-improvement. Blind-spot detection via failure clustering surfaces what the system doesn't know it's bad at.

5. Prompt evolution. Natural selection for system prompts. Extracts behavioural "genes" from agent definitions, applies mutations, and measures fitness against live data. Cryptographically gated — no unsupervised self-modification.

6. Evolution review. Closed-loop measurement: leak-count tracking, compliance rates, context-overflow incidents, decision-frequency analysis. The system knows whether it's actually improving, not just whether it's changing.

The evolution loop collects signals, formats them as mutation capsules, runs the mutation engine, and commits changes atomically. Detection rules are encoded as expression trees; crossover and mutation operators generate novel hypotheses, and fitness functions select for rules that find real bias signal in live data. The system converges on anomalies no human wrote a rule for. When the population stabilises around a detection pattern, we extract and crystallise it: proven patterns become permanent monitors while exploration continues in other regions.
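The shape of that loop can be sketched with numeric thresholds standing in for full expression trees. Everything here, the events, the fitness function, and the mutation step, is an invented toy, but the select-mutate-refill cycle is the same:

```javascript
// Toy "biased-decision scores" forming two clusters.
const events = [0.1, 0.2, 0.8, 0.9, 0.85, 0.15];

// Fitness: does the threshold cleanly separate the high cluster?
const fitness = (t) => (events.filter((e) => e > t).length === 3 ? 1 : 0);

function evolve(population, generations = 20) {
  for (let g = 0; g < generations; g++) {
    // Score, keep the fittest half, refill with mutated copies.
    population.sort((a, b) => fitness(b) - fitness(a));
    const survivors = population.slice(0, population.length / 2);
    const mutants = survivors.map((t) =>
      Math.min(1, Math.max(0, t + (Math.random() - 0.5) * 0.1)),
    );
    population = [...survivors, ...mutants];
  }
  return population[0]; // best rule found
}

const best = evolve([0.1, 0.3, 0.5, 0.9]);
// best lands between the clusters, with no human-written rule
```

The production system evolves expression trees with crossover as well as point mutation, but selection pressure works the same way: rules survive only if they keep finding real signal.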

And it can't degenerate. Hard safety caps: 10 iterations max, a 3-hour timeout, 60 files, 20K lines. Core identity files — the agent's fundamental operating principles — are cryptographically gated and never auto-modified. An Anti-Degeneration Lock enforces a strict priority ranking: Stability > Explainability > Reusability > Scalability > Novelty. Forbidden evolutions include fake intelligence, unverifiable mechanisms, vague concepts, and novelty bias. Every candidate mutation is scored by a Value Function with weighted dimensions (High Frequency 3x, Failure Reduction 3x, User Burden 2x). Score below 50 = don't build it.
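A minimal sketch of that gate, with invented 0-10 sub-scores and the weights from above:

```javascript
// The Value Function gate: weighted dimensions, hard cutoff at 50.
// Weights follow the post (frequency 3x, failure reduction 3x,
// user burden 2x); the 0-10 sub-scores are invented for illustration.
const WEIGHTS = { frequency: 3, failureReduction: 3, userBurden: 2 };

function valueScore(mutation) {
  const total = Object.entries(WEIGHTS).reduce(
    (sum, [dim, w]) => sum + w * mutation[dim],
    0,
  );
  return { total, build: total >= 50 };
}

const verdict = valueScore({ frequency: 8, failureReduction: 7, userBurden: 4 });
// total: 3*8 + 3*7 + 2*4 = 53, so this mutation clears the bar
```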

This is the hard problem in self-improving AI: how do you let a system evolve without letting it degenerate? The answer is provable constraints, auditable mutations, and a fitness function that rewards reliability over novelty. We solved it.

No rules to write. No thresholds to tune. Connect your event stream; the evolution engine handles the rest. The system that finds bias you didn't know to look for.


The Eyes and Ears: TES Event Ledger

TES is the sensory system — a single, immutable event stream that captures everything that happens across an enterprise and makes it available to the brain and the conscience in real time. An agent makes a decision. A sensor fires. A model scores a risk assessment. A product changes hands. A patient record updates. A contract executes. A drone completes a survey. TES captures it, enriches it, embeds it, and serves it — regardless of where the event originated or what domain it belongs to.

This is what unifies the stack. Agent Memory doesn't just remember conversations — it remembers everything that happened across the enterprise, because TES feeds it a structured, semantically embedded stream of reality. Bias Evolution doesn't just watch agent decisions — it watches business operations, pricing patterns, settlement flows, customer interactions, model outputs. The more events that flow through TES, the smarter the brain gets and the sharper the conscience becomes.

State is never stored. It's derived. The current truth about any entity is the canonical sum of every event that ever happened to it, recomputed from the stream on demand. No stale caches. No mutable rows. No conflicting versions. Just the event stream and whatever you want to derive from it.
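Derived state is just a fold over the stream. A minimal sketch with invented event types:

```javascript
// "State is never stored": derive an entity's current state by
// folding its ordered event history. Event names are illustrative.
const history = [
  { type: "ENTITY_CREATED", at: 1, data: { owner: "acme", status: "draft" } },
  { type: "STATUS_CHANGED", at: 2, data: { status: "active" } },
  { type: "OWNER_TRANSFERRED", at: 3, data: { owner: "globex" } },
];

// Projection = left fold over the stream; later events win.
const project = (events) =>
  events.reduce((state, e) => ({ ...state, ...e.data }), {});

const current = project(history);
// { owner: "globex", status: "active" }
```

Because the projection is a pure function of the stream, there is nothing to invalidate and nothing to drift out of sync.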

Every event triggers a two-stage reactive pipeline. Stage 1 is blocking: the event persists, the projection recomputes, a semantic embedding generates. Stage 2 fans out in parallel to 10+ independent consumers — AI enrichment (vision, classification, pricing, valuation), predictive models (demand forecasting, supply chain optimisation), settlement protocol adapters, usage metering, compliance logging, webhooks. Consumer failures never block the stream. Add a new consumer tomorrow; every past and future event flows through it automatically.
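The non-blocking fan-out is the classic settle-don't-fail pattern. A sketch with invented consumer names:

```javascript
// Stage-2 fan-out: all consumers run in parallel, and a failing
// consumer is recorded rather than propagated, so it never blocks
// the stream. Consumer names are invented for illustration.
async function fanOut(event, consumers) {
  const results = await Promise.allSettled(
    consumers.map((c) => c.handle(event)),
  );
  return results.map((r, i) => ({
    consumer: consumers[i].name,
    ok: r.status === "fulfilled",
  }));
}

const consumers = [
  { name: "enrichment", handle: async () => "ok" },
  { name: "webhook", handle: async () => { throw new Error("endpoint down"); } },
];

fanOut({ type: "ITEM_SOLD" }, consumers).then((report) => {
  // enrichment succeeds, webhook fails, the stream moves on either way
  console.log(report);
});
```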

Every entity gets AI enrichment as a first-class operation — not a bolt-on analytics layer, but part of the event pipeline itself. Semantic embeddings, classification, market intelligence, predictive scoring. The data is born ready for intelligence. Every entity becomes queryable by meaning, not just by ID. Every entity becomes part of the structured, embedded stream that Memory reasons over and Bias evolves against.

Define any entity type. Emit events against it. TES handles the rest: projection computation, AI enrichment, embedding generation, consumer fan-out, correlation tracking, compliance logging. 40 event types across 8 categories today — agent operations, entity lifecycle, financial transactions, logistics, compliance, IoT telemetry, human interactions, usage metering — and extensible to any domain.

  • Time travel. Reconstruct any entity at any point in its lifecycle. What was the state three months ago? What changed and why?

  • Correlation. A single correlation ID links events across entity types and systems. One query returns the full journey.

  • Auditability by construction. You don't add audit trails later — the event stream is the audit trail.

  • Multi-protocol settlement. Financial events trigger settlement automatically — Agent Pay, x402, Stripe, Mastercard — with full double-entry ledger support.

  • Regulatory compliance as a byproduct. When a regulator asks "what happened and why?", the answer is tes.audit("session", "ses_7f2a", { format: "eu_ai_act" }). Structured. Exportable. Complete.
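Time travel is just a fold with a cutoff: replay only the events up to a point in time. A sketch with invented events:

```javascript
// Reconstruct past state by replaying the stream up to a cutoff.
// Events and field names are illustrative.
const stream = [
  { at: "2025-11-01", data: { status: "draft" } },
  { at: "2025-12-10", data: { status: "active" } },
  { at: "2026-02-02", data: { status: "recalled" } },
];

const stateAt = (events, cutoff) =>
  events
    .filter((e) => e.at <= cutoff)      // ISO dates compare lexically
    .reduce((s, e) => ({ ...s, ...e.data }), {});

const atYearEnd = stateAt(stream, "2025-12-31");
// { status: "active" }: the entity as it stood before the recall
```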

This is the same architectural pattern that underpins bank ledgers, air traffic control systems, and medical record infrastructure. We generalised it for the agentic era — because any system that wants memory, accountability, and self-correction needs an immutable, AI-enriched event stream underneath it. TES is that foundation.


Pentatonic Engineering

Building the infrastructure layer for the agentic era.

Try it yourself

Build with Agent Memory, Bias Evolution, and TES

Start emitting events in minutes. Free tier includes 10,000 events per month with AI enrichment and vector search.