A technical walkthrough of ekkOS — an 11-layer cognitive architecture that gives AI agents persistent memory, continuous learning, and cross-session intelligence.
Most AI memory solutions are a single key-value store. ekkOS implements a layered architecture inspired by human cognitive science — each layer serves a distinct purpose in the memory lifecycle.
Each layer is an independent, addressable memory system. Agents search, write, and reason across all 11 layers via a single MCP tool call.
The architecture supports both personal (per-user) and collective (cross-user) memory scopes.
372 database tables handle memory, traces, embeddings, and evaluations. Here's how the core data model works.
// Pattern — a proven solution extracted
// from agent conversations
interface Pattern {
  id: string;
  title: string;
  problem: string;
  solution: string;
  confidence: number;    // 0.0 - 1.0
  success_rate: number;  // tracked via outcomes
  applied_count: number;
  tags: string[];
  embedding: number[];   // 1536-dim vector
  works_when: string[];
  anti_patterns: string[];
  created_at: Date;
  last_applied: Date;
}
// Active Forgetting — patterns decay
// if unused, get quarantined if failing
interface DecayConfig {
  decay_rate: number;           // 0.95 per period
  min_confidence: number;       // floor at 0.1
  quarantine_threshold: number; // quarantine below 0.3
  promotion_threshold: number;  // promote above 0.8
}

// Semantic Rehydration — vector search
// replaces positional context lookup
interface ContextFrame {
  session_id: string;
  active_patterns: Pattern[];
  rehydrated_turns: Turn[];
  decay_rate: number;
  eviction_threshold: number;
}
// Search across evicted context using
// embedding similarity, not position
async function rehydrate(
  query: string,
  session: string
): Promise<ContextFrame> {
  const embedding = await embed(query);
  return searchEvictedContent(
    embedding, session
  );
}

LLM API costs scale fast. ekkOS implements intelligent routing and cache optimization to cut token usage by 77% while maintaining quality.
Claude Code makes 10-20+ API calls per user prompt (2-3 per tool round-trip). Standard proxy pipelines inject context at varying positions, breaking Anthropic's prompt cache on every call. Result: 10-20x cost multiplication from cache misses.
Designed a "Cache-Preserving Passthrough" algorithm that detects tool round-trips and skips non-essential pipeline stages. Memory injections use Redis-cached content (stable prefix = cache hit). Emergency eviction only fires at 95% vs. the normal 75% threshold.
Key insight: Anthropic's prompt cache gives a 90% discount on cache hits. The cache works on the exact prefix of the messages array. Any content change at any position breaks the cache.
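The cache-preserving idea above can be sketched in a few lines. This is a minimal illustration, not the ekkOS implementation — the function and type names are hypothetical:

```typescript
type Message = { role: string; content: string };

// Keep injected memory in one stable block at the very front of the
// messages array, so the serialized prefix is byte-identical across
// round-trips and the prompt cache keeps hitting.
function injectStablePrefix(
  cachedMemory: string, // Redis-cached, unchanged between round-trips
  conversation: Message[]
): Message[] {
  return [
    { role: 'system', content: cachedMemory }, // stable prefix
    ...conversation                            // only the tail grows
  ];
}

// A cache hit requires the exact same leading messages.
function sharesCachePrefix(a: Message[], b: Message[], n: number): boolean {
  return JSON.stringify(a.slice(0, n)) === JSON.stringify(b.slice(0, n));
}
```

Because only the tail of the array changes between tool round-trips, the leading messages stay byte-identical and qualify for the cached-prefix discount.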
Anthropic's Model Context Protocol (MCP) is the standard for connecting AI agents to external tools. ekkOS exposes all 11 memory layers as MCP tools.
Teach Claude something in VS Code — Cursor already knows it. All agents share the same memory via MCP, regardless of which IDE or tool you use.
{
  "mcpServers": {
    "ekkos-memory": {
      "type": "sse",
      "url": "https://mcp.ekkos.dev/sse",
      "env": {
        "EKKOS_USER_ID": "your-user-id"
      }
    }
  }
}
// One config. Every IDE connected.
// Memory persists across sessions,
// tools, and even team members.

// Agent automatically searches memory
// before answering any question
const results = await ekkos.search({
  query: "supabase auth setup",
  sources: ["patterns", "episodic"]
});
// Found 3 patterns from past sessions
// Confidence: 0.94, 0.87, 0.72
// Agent applies the highest-confidence
// solution without re-deriving it.

185 API functions with TypeScript types. 71 automated tests. Clean interfaces designed for other developers to build on.
import { EkkosClient } from '@ekkos/sdk';

const ekkos = new EkkosClient({
  transport: 'sse',
  endpoint: 'https://mcp.ekkos.dev'
});

// Search memory before answering
const context = await ekkos.search({
  query: 'deployment error on Railway',
  sources: ['patterns', 'episodic'],
  limit: 5
});

// Apply a pattern and track outcome
const app = await ekkos.track({
  pattern_id: context.patterns[0].id,
  context: { task: 'fix deployment' }
});

// Record success for reinforcement
await ekkos.outcome({
  application_id: app.id,
  success: true
});
// Pattern confidence: 0.82 → 0.86

// When a bug is fixed, capture it
await ekkos.forge({
  title: 'Railway PM2 restart loop',
  problem:
    'PM2 workers restart endlessly ' +
    'when memory exceeds 512MB limit',
  solution:
    'Set max_memory_restart to 450MB ' +
    'with graceful shutdown handler',
  tags: ['railway', 'pm2', 'memory'],
  works_when: [
    'Node.js worker on Railway',
    'PM2 cluster mode'
  ],
  anti_patterns: [
    'Increasing memory limit only ' +
    'delays the crash'
  ]
});
// Next time any agent hits this issue,
// the solution surfaces automatically.

Four purpose-built algorithms for AI memory management. Each solves a specific failure mode discovered during production use.
Problem: Prompt cache misses cost 10-20x on every tool round-trip due to content injection at varying positions.
Solution: Detect tool round-trips, skip non-essential pipeline stages, use Redis-cached stable prefix for cache hits.
Problem: No way to measure if the system was actually improving. Agents hallucinated on complex tasks with no consistency check.
Solution: Custom scoring formula that tracks pattern success rates, retrieval relevance, and error frequency over time to quantify improvement.
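The scoring idea can be sketched as a weighted combination of the three signals named above. The weights and names here are illustrative assumptions — the actual Delta-Prometheus formula is not published in this document:

```typescript
// Illustrative weights only, not the production formula.
interface EvalSignals {
  patternSuccessRate: number; // 0..1, from outcome tracking
  retrievalRelevance: number; // 0..1, mean similarity of retrieved patterns
  errorFrequency: number;     // 0..1, share of tasks that errored
}

function convergenceScore(s: EvalSignals): number {
  const raw =
    0.5 * s.patternSuccessRate +
    0.3 * s.retrievalRelevance +
    0.2 * (1 - s.errorFrequency);
  return Math.round(raw * 1000) / 1000; // higher is better
}
```

Tracking this score over time turns "is the system improving?" into a trend line rather than a guess.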
Problem: Stale patterns accumulate forever. Bad solutions never get removed. Memory becomes noisy over time.
Solution: Bio-inspired memory hygiene: quarantine failing patterns (<30% success), merge near-duplicates (>92% similarity), decay unused patterns over time, and promote consistent winners to the collective scope.
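Using the thresholds quoted above (0.95 decay per period, 0.1 confidence floor, quarantine below 30% success, promote at 0.8), the per-pattern update can be sketched like this — names are illustrative, not the production API:

```typescript
type PatternState = 'active' | 'quarantined' | 'promoted';

interface LivePattern {
  confidence: number;   // 0..1
  successRate: number;  // 0..1, from tracked outcomes
  periodsUnused: number;
}

function applyForgetting(p: LivePattern): { confidence: number; state: PatternState } {
  // Unused patterns decay multiplicatively (0.95 per period), floored at 0.1.
  const decayed = Math.max(0.1, p.confidence * Math.pow(0.95, p.periodsUnused));
  if (p.successRate < 0.3) return { confidence: decayed, state: 'quarantined' };
  if (decayed >= 0.8) return { confidence: decayed, state: 'promoted' };
  return { confidence: decayed, state: 'active' };
}
```

Duplicate merging (the >92% similarity check) would run as a separate pass over pattern embeddings and is omitted here.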
Problem: After context eviction, positional lookup (last 5 turns) misses relevant history that appeared earlier in conversation.
Solution: Replace positional lookup with vector similarity search across ALL evicted content. Always-on, not triggered by keywords.
Junior developers show only success. Senior engineers show how they handled failure. Here are three production issues and the systems I built to fix them.
Task completion: 40%
Without memory, agents re-derived solutions from scratch every session. They'd get different (often wrong) answers each time. No feedback loop meant bad patterns persisted.
Fix: Built the Convergence Evaluator to track answer consistency. Combined with pattern extraction and outcome tracking, task completion rose from 40% to 86.7%.
Pattern tracking: broken
The pattern application store used an in-memory Map<string, Data>. Every Vercel cold start cleared it. Timeout-based auto-outcomes were unreliable in serverless — functions terminate before timers fire.
Fix: Migrated to a Redis-backed store with 5-minute TTL and in-memory fallback. Replaced setTimeout with lazy auto-outcome processing — on each new Track call, stale verified applications >30s old get processed first.
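The lazy sweep described above can be sketched as follows. The store shape and names are illustrative assumptions, not the production code:

```typescript
interface TrackedApplication {
  id: string;
  createdAt: number; // epoch ms
  resolved: boolean;
}

const STALE_MS = 30_000; // applications older than 30s get auto-processed

// Called at the start of every new Track request: stale entries are
// swept first, so no background timer is ever needed — a good fit for
// serverless, where functions terminate before setTimeout fires.
function sweepStale(
  apps: TrackedApplication[],
  now: number,
  autoResolve: (a: TrackedApplication) => void
): void {
  for (const a of apps) {
    if (!a.resolved && now - a.createdAt > STALE_MS) {
      autoResolve(a);
      a.resolved = true;
    }
  }
}
```

Piggybacking cleanup on incoming requests trades a small per-call cost for correctness in an environment where timers cannot be trusted.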
Agents forgot mid-conversation
When context windows filled up, the eviction system removed older messages. But the rehydration system only looked at the last 5 turns — missing critical context from earlier in the conversation. Agents would "forget" decisions made 20 minutes ago.
Fix: Replaced positional lookup with semantic vector search across ALL evicted content. Now always-on (not keyword-triggered). Used Google AI embeddings (free tier) to keep costs at zero. Evicted content is stored losslessly in Cloudflare R2.
996+ commits as sole architect. 372 database tables covering memory, traces, evaluations, and analytics. 10+ GitHub Actions CI/CD pipelines for automated testing and deployment.
Increased AI task completion rates from 40% to 86.7% through automated pattern extraction and outcome tracking. Continuous improvement via the Golden Loop feedback mechanism.
Developed 4 custom algorithms: Cache-Preserving Passthrough, Delta-Prometheus Convergence Evaluator, Active Forgetting Engine, and Semantic Rehydration — each solving a specific production failure mode.
Created a business logic extraction pipeline processing 2,500+ patterns with 92.8% success rate. Automated learning pipelines analyze agent conversations to construct knowledge graphs.
This isn't a mock: it's Gemini 2.0 Flash with live MCP tool access to the real ekkOS production database. Ask it anything and watch the raw tool calls execute.

Lead Systems Architect
Seann is a Lead AI Engineer specializing in cognitive architectures and agentic memory systems. He designed and built ekkOS from the first commit to the 11th layer — 996+ commits, 372 database tables, and 4 custom algorithms, all as a solo architect. His focus is building production-grade infrastructure that makes AI agents genuinely smarter over time.
Whitby, Ontario · 289-927-0983