LLM observability and agent memory.
Open source. Runs locally.
Wrap OpenAI, Anthropic, or Workers AI — track every call, tool use, and session locally with a 4-layer memory stack. No account required. Connect to the hosted TES platform when you need 7-layer memory, AI enrichment, and vector search at scale.
Claude Code plugin →
import { TESClient } from "@pentatonic/ai-agent-sdk";
import Anthropic from "@anthropic-ai/sdk";
// Local mode — no API key, no account
const tes = new TESClient();
// One line. That's it.
const ai = tes.wrap(new Anthropic());
// Every call is tracked locally with 4-layer memory
const res = await ai.messages.create({
model: "claude-sonnet-4-20250514",
max_tokens: 1024,
messages: [{ role: "user", content: "Hello" }],
});
// Locally captured:
// → CHAT_TURN { tokens: 38, model: "claude-sonnet-4...", ... }
// → Tool calls, sessions, token usage — all stored locally
Get started in 30 seconds
Three steps, then you're done
Install
Install from npm or PyPI. Or use the interactive CLI to set up your account and SDK in one step.
Wrap your client
Pass your existing LLM client to tes.wrap(). It returns a transparent proxy — your existing code doesn't change.
Done
Every LLM call is tracked locally — token usage, tool calls, sessions. The 4-layer memory stack runs on your machine with no external dependencies. Add your TES credentials to upgrade to the 7-layer hosted stack and AI enrichment.
npm install @pentatonic/ai-agent-sdk
Two tiers
Open source core. Hosted platform when you scale.
The SDK runs entirely locally with no account or API key. Connect to the hosted TES platform to unlock 7-layer memory, AI enrichment, and a global immutable event store.
Open source
Free forever
Install from npm or PyPI and run entirely on your own infrastructure. No account, no API key, no egress.
- LLM call wrapping — OpenAI, Anthropic, Workers AI
- Token usage tracking per call and session
- Tool call capture with args and results
- Response normalisation across providers
- Click tracking with HMAC-SHA256 signing
- 4-layer local memory stack
- Claude Code plugin for session tracking
- JavaScript + Python, MIT licensed
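The click-tracking entry above mentions HMAC-SHA256 signing. As a rough illustration of what signing a click payload involves — the scheme below is an assumption for demonstration, not the SDK's actual wire format:

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Sign a click payload so the receiving end can detect tampering.
// (Illustrative only — payload shape and key handling are assumed.)
function signClick(payload: string, secret: string): string {
  return createHmac("sha256", secret).update(payload).digest("hex");
}

// Verify with a constant-time comparison to avoid timing side channels.
function verifyClick(payload: string, signature: string, secret: string): boolean {
  const expected = Buffer.from(signClick(payload, secret), "hex");
  const given = Buffer.from(signature, "hex");
  return expected.length === given.length && timingSafeEqual(expected, given);
}
```

The constant-time comparison matters: a naive `===` on the hex strings can leak how many leading characters matched.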
Hosted TES platform
Per event
Point the SDK at the TES platform for a hosted, globally distributed event store with advanced memory and AI enrichment.
- Everything in open source, plus:
- 7-layer bio-inspired memory stack
- AI enrichment pipeline — vision, taxonomy, pricing
- 1024-dim vector embeddings, semantic search
- Immutable event store on Cloudflare edge (300+ locations)
- Bias Evolution — self-evolving drift detection
- EU AI Act compliance exports
- Multi-tenant with data residency per client
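At its core, semantic search over embeddings is nearest-neighbour ranking by cosine similarity. A toy sketch with 3-dimensional vectors (the hosted platform uses 1024-dim embeddings; `topK` here is illustrative, not an SDK API):

```typescript
// Cosine similarity between two equal-length vectors
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Rank stored memory vectors by similarity to a query embedding
function topK(query: number[], memories: { id: string; vector: number[] }[], k: number) {
  return memories
    .map((m) => ({ id: m.id, score: cosineSimilarity(query, m.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```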
Open source — local
// No API key needed — runs locally
const tes = new TESClient();
const ai = tes.wrap(new OpenAI());
Hosted TES — add two env vars
// Point at hosted TES for 7-layer memory + enrichment
const tes = new TESClient({
apiKey: process.env.TES_API_KEY,
clientId: process.env.TES_CLIENT_ID,
});
const ai = tes.wrap(new OpenAI()); // same API
Provider agnostic
Three providers, one interface
The SDK detects your client type and intercepts the right method. Switch providers without changing your tracking code.
JavaScript / TypeScript
import { TESClient } from "@pentatonic/ai-agent-sdk";
import OpenAI from "openai";
const tes = new TESClient({
apiKey: process.env.TES_API_KEY,
clientId: process.env.TES_CLIENT_ID,
});
const ai = tes.wrap(new OpenAI());
const res = await ai.chat.completions.create({
model: "gpt-4o",
messages: [{ role: "user", content: "Hello" }],
});
// → CHAT_TURN event emitted automatically
Python
import os
from pentatonic_agent_events import TESClient
from openai import OpenAI
tes = TESClient(
api_key=os.environ["TES_API_KEY"],
client_id=os.environ["TES_CLIENT_ID"],
)
ai = tes.wrap(OpenAI())
res = ai.chat.completions.create(
model="gpt-4o",
messages=[{"role": "user", "content": "Hello"}],
)
# → CHAT_TURN event emitted automatically
| Provider | Detection | Intercepted method |
|---|---|---|
| OpenAI | client.chat.completions.create | chat.completions.create |
| Anthropic | client.messages.create | messages.create |
| Workers AI | client.run | run |
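The detection rules in the table can be sketched with a recursive Proxy. This is an illustrative reconstruction, not the SDK's source — a real implementation would instrument only the specific method listed per provider, whereas this sketch wraps every function it encounters:

```typescript
type AnyClient = Record<string, any>;

// Detect the provider from the method surface, per the table above
function detectProvider(client: AnyClient): string {
  if (client.chat?.completions?.create) return "openai";
  if (client.messages?.create) return "anthropic";
  if (typeof client.run === "function") return "workers-ai";
  return "unknown";
}

// Transparent proxy: calls are intercepted for tracking,
// everything else passes through to the underlying client.
function wrap<T extends AnyClient>(client: T, onCall: (provider: string) => void): T {
  const provider = detectProvider(client);
  const instrument = (obj: any): any =>
    new Proxy(obj, {
      get(target, prop) {
        const value = Reflect.get(target, prop);
        if (typeof value === "function") {
          return (...args: unknown[]) => {
            onCall(provider); // tracking hook fires on each call
            return value.apply(target, args);
          };
        }
        // Recurse so nested paths like chat.completions.create are covered
        if (value !== null && typeof value === "object") return instrument(value);
        return value;
      },
    });
  return instrument(client);
}
```

Because the proxy forwards every property access, call sites keep their original types and behaviour — which is what lets existing code stay unchanged.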
Flexible
Two modes
Auto-wrap for zero boilerplate. Manual session for full control. Both emit the same immutable TES events.
Auto-wrap
Pass your LLM client to tes.wrap(). Tracking happens automatically — no changes to your existing call sites.
const tes = new TESClient({ apiKey, clientId });
const ai = tes.wrap(new OpenAI());
// Your code stays exactly the same
const res = await ai.chat.completions.create({
model: "gpt-4o",
messages: [{ role: "user", content: "Hello" }],
});
// TES event emitted in the background
Manual session
Create a session directly for multi-round tool-calling loops or when you need to attach metadata to individual turns.
const session = tes.session({
sessionId: "conv-123",
metadata: { userId: "u_1", channel: "api" },
});
// After your LLM call
session.record(rawLLMResponse);
// Emit when the turn is complete
await session.emitChatTurn({
userMessage: "Find items under $50",
assistantResponse: res.content[0].text,
});
// Accumulated token totals across all rounds
console.log(session.totalUsage);
// { prompt_tokens: 142, completion_tokens: 87, ai_rounds: 3 }
Hosted platform
More when you connect to TES
The hosted TES platform adds 7-layer memory, AI enrichment, vector search, and an enterprise immutable event store. Same SDK — just two extra env vars.
Agent Memory
7-layer bio-inspired persistent memory for AI agents. Episodic, semantic, procedural — with vector search and confidence decay.
Learn more →
Bias Evolution
Self-evolving bias detection that runs evolutionary loops over your event streams. Catches drift before it reaches production.
Learn more →
TES x NemoClaw
NVIDIA built the sandbox. We built the brain. The first complete enterprise agent stack on GB10 Blackwell.
Learn more →
Claude Code Plugin
Persistent memory and automatic session tracking for Claude Code. Install the MCP plugin and every session is searchable.
Learn more →
Reference
API reference
The full public surface of the SDK. Zero runtime dependencies.
new TESClient(options)
Create a client. No options are needed for local mode. Hosted mode requires apiKey and clientId. Optional: endpoint, captureContent, maxContentLength.
tes.wrap(client, opts?)
Return a transparent proxy around your LLM client. Every call is auto-tracked.
tes.session(opts?)
Create a manual Session for full control over multi-round conversations.
session.record(rawResponse)
Normalise a raw provider response and accumulate token usage.
session.emitChatTurn({...})
Emit a CHAT_TURN event with user message, response, and token totals.
session.emitToolUse({...})
Emit a TOOL_USE event with tool name, args, and optional result.
session.emitSessionStart()
Emit a SESSION_START event at the beginning of a conversation.
normalizeResponse(raw)
Standalone utility. Converts any provider response into { content, model, usage, toolCalls }.
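As a rough sketch of what a cross-provider normaliser does — field names below come from the public OpenAI and Anthropic response shapes; the SDK's actual implementation may differ:

```typescript
interface Normalized {
  content: string;
  model: string;
  usage: { prompt_tokens: number; completion_tokens: number };
  toolCalls: unknown[];
}

// Illustrative normaliser: map two different response shapes onto one interface
function normalizeResponse(raw: any): Normalized {
  // Anthropic: content is an array of blocks; usage uses input/output_tokens
  if (Array.isArray(raw.content)) {
    return {
      content: raw.content
        .filter((b: any) => b.type === "text")
        .map((b: any) => b.text)
        .join(""),
      model: raw.model,
      usage: {
        prompt_tokens: raw.usage?.input_tokens ?? 0,
        completion_tokens: raw.usage?.output_tokens ?? 0,
      },
      toolCalls: raw.content.filter((b: any) => b.type === "tool_use"),
    };
  }
  // OpenAI: choices[0].message carries content and tool_calls
  const msg = raw.choices?.[0]?.message ?? {};
  return {
    content: msg.content ?? "",
    model: raw.model,
    usage: {
      prompt_tokens: raw.usage?.prompt_tokens ?? 0,
      completion_tokens: raw.usage?.completion_tokens ?? 0,
    },
    toolCalls: msg.tool_calls ?? [],
  };
}
```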
Open source
Start local. Scale with TES.
Install the open-source SDK and run locally for free. When you need 7-layer memory, AI enrichment, and a hosted event store — the hosted platform is one config change away.