Intermediate Retrieval

Episodic Replay

Reconstruct past conversations in chronological order. Let your agent review the exact sequence of a debugging session, a design discussion, or any multi-step workflow — and continue with full context intact.

⏱ ~20 min to implement 📦 Requires: Dakera v0.11+


Prerequisites

Running Dakera server (Quickstart guide)
Familiarity with session IDs and metadata-based filtering
Understanding of sequence numbers vs. timestamps for ordering

The Problem

Semantic recall finds the most relevant memories — not the most sequential ones. When a user asks a coding assistant "what did we figure out in last Tuesday's debugging session?", the assistant needs to reconstruct the exact sequence of that session: the initial error, the hypotheses explored, the failed attempts, and the final fix. Pure vector recall returns the most relevant exchanges from that session but scrambles their order, destroying the narrative arc.

Without ordered replay, the agent cannot understand why a decision was made. It might resurface an approach that was already tried and failed in the same session. It cannot tell the user "we ruled out X because Y" if it cannot reconstruct the sequence in which X was explored and rejected. Causality requires order.

Replay vs. Cross-Session Context

Cross-Session Context retrieves the highest-relevance memories across many sessions. Episodic Replay retrieves all memories from a specific session in exact sequence. Use Cross-Session Context for "what do I know about this user?" Use Episodic Replay for "replay session X so I can continue from exactly where we left off."

Architecture: Episodic Memory Buffer with Replay Selection

Each interaction in a session is stored with two pieces of metadata: a session ID (identifies which conversation it belongs to) and a sequence number (an incrementing integer preserving order within that session). Replay is achieved by recalling memories filtered by session ID, then sorting by sequence number client-side. The sequence number is the canonical ordering key: two turns stored in the same millisecond have an ambiguous timestamp order, while integers issued by a single counter are always distinct.
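The ordering argument above can be sketched with plain Python objects, with no Dakera calls involved. The `Turn` dataclass and both helper functions are hypothetical names introduced only for this illustration; the point is that a sort on sequence number restores the order even when two turns share a timestamp.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Turn:
    session_id: str
    sequence: int       # incrementing integer, unique within a session
    timestamp_ms: int   # wall-clock time; two turns can share a value
    content: str


def replay(turns: list[Turn], session_id: str) -> list[str]:
    """Keep one session's turns and order them by sequence number."""
    own = [t for t in turns if t.session_id == session_id]
    return [t.content for t in sorted(own, key=lambda t: t.sequence)]


def replay_by_timestamp(turns: list[Turn], session_id: str) -> list[str]:
    """Same filter, but ordered by timestamp (shown only for comparison)."""
    own = [t for t in turns if t.session_id == session_id]
    return [t.content for t in sorted(own, key=lambda t: t.timestamp_ms)]


# Turns as a relevance-ranked recall might return them: out of order,
# mixed with another session, and with two turns in the same millisecond.
recalled = [
    Turn("sess-a", 3, 1000, "[user] Still failing."),
    Turn("sess-b", 1, 990, "[user] Unrelated session."),
    Turn("sess-a", 1, 999, "[user] TypeError in parser.js"),
    Turn("sess-a", 2, 1000, "[agent] Try a null check."),
]

by_sequence = replay(recalled, "sess-a")
by_timestamp = replay_by_timestamp(recalled, "sess-a")
```

Here `by_sequence` comes back in the original 1, 2, 3 order, while `by_timestamp` leaves turns 2 and 3 in whatever order the recall happened to return them, because their timestamps tie.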

Step-by-Step Implementation

Step 1: Generate a stable session ID at conversation start

Create a unique session ID (UUID or timestamp-based) at the start of each conversation. This ID ties all memories in the session together and is the key used for replay. Store it in your application's session state. Example: sess-20260519-u8a2f.

Step 2: Store each interaction with session ID and sequence number

Store every turn, both user messages and agent responses, immediately after it occurs. Attach the session ID and the sequence number to each stored turn; the code in this guide carries them as tags (the session ID itself plus seq:<n>). Use an integer counter starting at 1 for the sequence, not timestamps. Two messages stored in the same millisecond sort ambiguously by timestamp; integers from a single counter never collide.

Step 3: Tag messages by role and significance

Tag each memory with role:user or role:agent, and optionally mark significant exchanges: decision, breakthrough, error, solution. These tags enable selective replay: replaying only decision points rather than every turn saves context window space.

Step 4: Recall the full session with high top_k

To replay a session, recall with the session ID as the query term and a high top_k (100+). Dakera's BM25 will match the session ID string. The recall returns memories in relevance order, so sort them by sequence number ascending after retrieval to restore the original order.

Step 5: Compress completed sessions for long-term storage

After a session ends and its debugging task is complete, the full turn-by-turn replay becomes less valuable. Keep key decision points and final resolutions at high importance; reduce importance of exploratory back-and-forth. This keeps the memory store lean without losing the critical causal chain.

Implementation

# Store interaction 1: user reports error
curl -X POST http://localhost:3300/v1/memory/store \
  -H "Authorization: Bearer dk-..." \
  -H "Content-Type: application/json" \
  -d '{
    "agent_id": "codeassist-dev-7",
    "content": "[user] TypeError: cannot read property length of undefined in parser.js:42",
    "memory_type": "episodic",
    "importance": 0.80,
    "tags": ["episode", "sess-20260519", "role:user", "error", "seq:1"]
  }'

# Store interaction 2: agent hypothesis
curl -X POST http://localhost:3300/v1/memory/store \
  -H "Authorization: Bearer dk-..." \
  -H "Content-Type: application/json" \
  -d '{
    "agent_id": "codeassist-dev-7",
    "content": "[agent] Likely null check missing. Suggested: if (arr && arr.length) before accessing.",
    "memory_type": "episodic",
    "importance": 0.75,
    "tags": ["episode", "sess-20260519", "role:agent", "hypothesis", "seq:2"]
  }'

# (Interaction 3, the user's "still failing" reply, is stored the same way with seq:3.)

# Store interaction 4: root cause found (high importance)
curl -X POST http://localhost:3300/v1/memory/store \
  -H "Authorization: Bearer dk-..." \
  -H "Content-Type: application/json" \
  -d '{
    "agent_id": "codeassist-dev-7",
    "content": "[agent] Root cause: fetchData() not awaited at line 38 causing async race. Fix: add await.",
    "memory_type": "episodic",
    "importance": 0.95,
    "tags": ["episode", "sess-20260519", "role:agent", "decision", "root-cause", "seq:4"]
  }'

# Store interaction 5: confirmed fix (high importance)
curl -X POST http://localhost:3300/v1/memory/store \
  -H "Authorization: Bearer dk-..." \
  -H "Content-Type: application/json" \
  -d '{
    "agent_id": "codeassist-dev-7",
    "content": "[user] Fixed! Adding await before fetchData() resolved the TypeError. Tests green.",
    "memory_type": "episodic",
    "importance": 0.92,
    "tags": ["episode", "sess-20260519", "role:user", "solution", "resolved", "seq:5"]
  }'

# Replay the full session (sort by the seq:<n> tag client-side)
curl "http://localhost:3300/v1/memory/recall?agent_id=codeassist-dev-7&query=sess-20260519+debugging+session&top_k=100" \
  -H "Authorization: Bearer dk-..."

# Selective replay: only decisions and solutions
curl "http://localhost:3300/v1/memory/recall?agent_id=codeassist-dev-7&query=sess-20260519+decision+root+cause+solution&top_k=20&min_importance=0.90" \
  -H "Authorization: Bearer dk-..."

from dakera import DakeraClient
import uuid
from typing import Literal

client = DakeraClient(base_url="http://localhost:3300", api_key="dk-...")

Role = Literal["user", "agent"]
Significance = Literal["normal", "hypothesis", "error", "decision", "solution", "breakthrough"]

IMPORTANCE_BY_SIGNIFICANCE = {
    "normal": 0.72,
    "hypothesis": 0.75,
    "error": 0.80,
    "decision": 0.92,
    "solution": 0.95,
    "breakthrough": 0.97,
}

class EpisodicSession:
    """Records a conversation episode with full replay capability."""

    def __init__(self, agent_id: str, session_id: str = None):
        self.agent_id = agent_id
        self.session_id = session_id or f"sess-{uuid.uuid4().hex[:8]}"
        self.sequence = 0

    def record(
        self,
        role: Role,
        content: str,
        significance: Significance = "normal"
    ) -> dict:
        """Store one interaction turn."""
        self.sequence += 1
        importance = IMPORTANCE_BY_SIGNIFICANCE[significance]

        tags = [
            "episode",
            self.session_id,
            f"role:{role}",
            significance,
            f"seq:{self.sequence}"
        ]

        return client.store_memory(
            agent_id=self.agent_id,
            content=f"[{role}] {content}",
            memory_type="episodic",
            importance=importance,
            tags=tags
        )

    def _ordered(self, memories: list[dict]) -> list[str]:
        """Keep this session's turns and sort them by their seq:<n> tag."""
        def get_seq(m):
            for tag in (m.get("tags") or []):
                if tag.startswith("seq:"):
                    return int(tag[4:])
            return 0
        # The session ID in the query is not a hard filter, so drop
        # memories that do not carry this session's tag before sorting.
        own = [m for m in memories if self.session_id in (m.get("tags") or [])]
        own.sort(key=get_seq)
        return [m["content"] for m in own]

    def replay_full(self) -> list[str]:
        """Replay all turns in this session in order."""
        results = client.recall(
            agent_id=self.agent_id,
            query=f"{self.session_id} episode conversation turns",
            top_k=200
        )
        return self._ordered(results.get("memories", []))

    def replay_decisions_only(self) -> list[str]:
        """Replay only high-significance turns (decisions, solutions)."""
        results = client.recall(
            agent_id=self.agent_id,
            query=f"{self.session_id} decision root cause solution breakthrough",
            top_k=20,
            min_importance=0.90
        )
        return self._ordered(results.get("memories", []))

# --- Record the debugging session ---
session = EpisodicSession("codeassist-dev-7", "sess-20260519")

session.record("user",
    "TypeError: cannot read property 'length' of undefined in parser.js:42",
    significance="error"
)
session.record("agent",
    "Likely a null check missing. Try: if (arr && arr.length) before accessing.",
    significance="hypothesis"
)
session.record("user",
    "Still failing on the same line. I already had that null check.",
    significance="normal"
)
session.record("agent",
    "Root cause: fetchData() at line 38 is not awaited. Async race condition. Add 'await' before the call.",
    significance="decision"
)
session.record("user",
    "That was it! Adding await before fetchData() fixed the TypeError. Tests all green.",
    significance="solution"
)

# --- Replay later (new process, resuming work) ---
replay_session = EpisodicSession("codeassist-dev-7", "sess-20260519")

print("=== FULL REPLAY ===")
for turn in replay_session.replay_full():
    print(turn)

print("\n=== DECISIONS ONLY ===")
for turn in replay_session.replay_decisions_only():
    print(turn)

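# Expected output (decisions only), given the turns recorded above:
# [agent] Root cause: fetchData() at line 38 is not awaited. Async race condition. Add 'await' before the call.
# [user] That was it! Adding await before fetchData() fixed the TypeError. Tests all green.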

import { DakeraClient } from '@dakera-ai/dakera';
import { randomUUID } from 'crypto';

const client = new DakeraClient({ baseUrl: 'http://localhost:3300', apiKey: 'dk-...' });

type Role = 'user' | 'agent';
type Significance = 'normal' | 'hypothesis' | 'error' | 'decision' | 'solution' | 'breakthrough';

const IMPORTANCE: Record<Significance, number> = {
  normal: 0.72,
  hypothesis: 0.75,
  error: 0.80,
  decision: 0.92,
  solution: 0.95,
  breakthrough: 0.97,
};

class EpisodicSession {
  private agentId: string;
  private sessionId: string;
  private seq = 0;

  constructor(agentId: string, sessionId?: string) {
    this.agentId = agentId;
    this.sessionId = sessionId ?? `sess-${randomUUID().slice(0, 8)}`;
  }

  get id(): string { return this.sessionId; }

  async record(role: Role, content: string, sig: Significance = 'normal'): Promise<void> {
    this.seq += 1;
    const seqNum = this.seq;

    await client.storeMemory(this.agentId, {
      content: `[${role}] ${content}`,
      memoryType: 'episodic',
      importance: IMPORTANCE[sig],
      tags: ['episode', this.sessionId, `role:${role}`, sig, `seq:${seqNum}`],
    });
  }

  private getSeq(tags: string[] = []): number {
    const seqTag = tags.find(t => t.startsWith('seq:'));
    return seqTag ? parseInt(seqTag.slice(4), 10) : 0;
  }

  async replayFull(): Promise<string[]> {
    const results = await client.recall(
      this.agentId,
      `${this.sessionId} episode conversation turns`,
      { top_k: 200 }
    );
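    return results.memories
      // The query is not a hard filter: keep only this session's turns.
      .filter(m => (m.tags ?? []).includes(this.sessionId))
      .sort((a, b) => this.getSeq(a.tags) - this.getSeq(b.tags))
      .map(m => m.content);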
  }

  async replayDecisionsOnly(): Promise<string[]> {
    const results = await client.recall(
      this.agentId,
      `${this.sessionId} decision root cause solution breakthrough`,
      { top_k: 20, min_importance: 0.90 }
    );
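    return results.memories
      // The query is not a hard filter: keep only this session's turns.
      .filter(m => (m.tags ?? []).includes(this.sessionId))
      .sort((a, b) => this.getSeq(a.tags) - this.getSeq(b.tags))
      .map(m => m.content);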
  }
}

// --- Record the debugging session ---
const session = new EpisodicSession('codeassist-dev-7', 'sess-20260519');

await session.record('user', "TypeError: cannot read property 'length' of undefined in parser.js:42", 'error');
await session.record('agent', "Likely null check missing. Try: if (arr && arr.length) before accessing.", 'hypothesis');
await session.record('user', "Still failing on same line. Already had that null check.", 'normal');
await session.record('agent', "Root cause: fetchData() at line 38 not awaited. Async race. Add await.", 'decision');
await session.record('user', "That fixed it! Tests green. Shipping to staging.", 'solution');

// --- Replay later ---
const replaySession = new EpisodicSession('codeassist-dev-7', 'sess-20260519');

const allTurns = await replaySession.replayFull();
console.log('Full replay:', allTurns);

const keyTurns = await replaySession.replayDecisionsOnly();
console.log('Decisions only:', keyTurns);

use dakera_rs::{Client, StoreMemoryRequest, RecallRequest};

let client = Client::new("http://localhost:3300", "dk-...");
let agent_id = "codeassist-dev-7";
let session_id = "sess-20260519";

struct Turn {
    role: &'static str,
    content: &'static str,
    importance: f32,
    seq: u32,
}

let turns = vec![
    Turn { role: "user", content: "TypeError: cannot read property length of undefined in parser.js:42", importance: 0.80, seq: 1 },
    Turn { role: "agent", content: "Likely null check missing. Try: if (arr && arr.length) before accessing.", importance: 0.75, seq: 2 },
    Turn { role: "user", content: "Still failing. Already tried the null check.", importance: 0.72, seq: 3 },
    Turn { role: "agent", content: "Root cause: fetchData() not awaited at line 38. Async race. Add await.", importance: 0.92, seq: 4 },
    Turn { role: "user", content: "Fixed! Tests all green. Shipping to staging.", importance: 0.95, seq: 5 },
];

// Store all turns
for turn in &turns {
    client.store_memory(agent_id, StoreMemoryRequest {
        content: format!("[{}] {}", turn.role, turn.content),
        memory_type: "episodic".into(),
        importance: Some(turn.importance),
        tags: vec![
            "episode".into(),
            session_id.into(),
            format!("role:{}", turn.role),
            format!("seq:{}", turn.seq),
        ],
        ..Default::default()
    }).await?;
}

// Replay: recall all, sort by seq tag
let results = client.recall(agent_id, RecallRequest {
    query: format!("{} episode conversation turns", session_id),
    top_k: Some(200),
    ..Default::default()
}).await?;

let mut memories = results.memories;
memories.sort_by_key(|m| {
    m.tags.iter()
        .find(|t| t.starts_with("seq:"))
        .and_then(|t| t[4..].parse::<u32>().ok())
        .unwrap_or(0)
});

for m in &memories {
    println!("{}", m.content);
}

package main

import (
    "context"
    "fmt"
    "sort"
    "strconv"
    "strings"
    dakera "github.com/dakera-ai/dakera-go"
)

type Turn struct {
    Role       string
    Content    string
    Importance float64
    Seq        int
}

func getSeq(tags []string) int {
    for _, tag := range tags {
        if strings.HasPrefix(tag, "seq:") {
            if n, err := strconv.Atoi(tag[4:]); err == nil {
                return n
            }
        }
    }
    return 0
}

func main() {
    client := dakera.NewClient("http://localhost:3300", "dk-...")
    ctx := context.Background()
    agentID := "codeassist-dev-7"
    sessionID := "sess-20260519"

    turns := []Turn{
        {"user", "TypeError: cannot read property length of undefined in parser.js:42", 0.80, 1},
        {"agent", "Likely null check missing. Try: if (arr && arr.length) before accessing.", 0.75, 2},
        {"user", "Still failing. Already tried the null check.", 0.72, 3},
        {"agent", "Root cause: fetchData() at line 38 not awaited. Async race. Fix: add await.", 0.92, 4},
        {"user", "Fixed! Tests green. Shipping to staging.", 0.95, 5},
    }

    // Record all turns
    for _, t := range turns {
        client.StoreMemory(ctx, agentID, dakera.StoreMemoryRequest{
            Content:    fmt.Sprintf("[%s] %s", t.Role, t.Content),
            MemoryType: "episodic",
            Importance: t.Importance,
            Tags: []string{
                "episode", sessionID,
                "role:" + t.Role,
                fmt.Sprintf("seq:%d", t.Seq),
            },
        })
    }

    // Replay: recall and sort by seq
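    results, err := client.Recall(ctx, agentID, dakera.RecallRequest{
        Query: sessionID + " episode conversation turns",
        TopK:  200,
    })
    if err != nil {
        // Do not replay from a failed recall.
        fmt.Println("recall failed:", err)
        return
    }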

    sort.Slice(results.Memories, func(i, j int) bool {
        return getSeq(results.Memories[i].Tags) < getSeq(results.Memories[j].Tags)
    })

    for _, m := range results.Memories {
        fmt.Println(m.Content)
    }
}

Before & After: Replay State

The first block shows what semantic recall returns for a debugging session: relevant but unordered. The second shows episodic replay with sequence metadata: ordered and causally coherent.

Before: Semantic Recall Only

{
  // Recall returns by relevance,
  // not by order. Causality is lost.

  query: "debugging session",
  results: [
    "#5: Fixed! Tests green.",       // relevant
    "#4: Root cause: async race.",   // relevant
    "#1: TypeError at parser.js",    // relevant
    "#2: Try null check fix",        // relevant
    // #3 may not appear at all
    // (user said "still failing" —
    //  low semantic relevance)
  ]
  // Agent cannot tell #3 happened
  // BEFORE #4 — cannot know that
  // null check was already tried.
}

After: Episodic Replay with Sequence

{
  // Recall + sort by seq metadata.
  // Full causal chain restored.

  session_id: "sess-20260519",
  replay: [
    "#1 [user] TypeError in parser.js",
    "#2 [agent] Try null check fix",
    "#3 [user] Still failing (tried it)",
    "#4 [agent] Root: async race, await!",
    "#5 [user] Fixed! Tests green.",
  ],
  // Agent now KNOWS: null check was
  // tried and failed (#3), so it will
  // never suggest it again. The fix
  // was adding await (#4).
}

Real-World Example: Coding Assistant Reviewing Past Debug Sessions

Scenario: Orbit Code is an AI pair programmer deployed to engineering teams. Developers spend hours in complex debugging sessions and frequently return to the same codebase days later. Without episodic replay, the assistant starts fresh each time. With it, the assistant can reference specific past sessions by name and continue with full context.

A Typical Developer Workflow Across 5 Days

Day 1 (Tuesday), Session sess-20260519: Developer Priya spends 90 minutes debugging an async race condition in the parser module. Five turns are recorded. The breakthrough (adding await) is tagged as a decision at importance 0.92. The session ends with the fix shipped to staging.

Day 3 (Thursday): Priya opens a new session. She says "I need to refactor the parser module we fixed Tuesday, can you remind me what we found?" The assistant replays sess-20260519 in order, identifies the root cause (#4) and resolution (#5), and responds: "In Tuesday's session, we found that fetchData() wasn't awaited at line 38, causing a race condition. We added await and it resolved the TypeError. The refactoring should ensure all async calls in that module are properly awaited." Priya says it felt like talking to a teammate who was there.

Day 5 (Saturday): A new bug appears in the same module. The agent proactively replays the related sessions and warns: "This looks similar to the async race we fixed in session sess-20260519. Check if the new processChunks() function is missing an await." It was. Fix takes 3 minutes instead of 90.

Pro Tip: Selective Replay Saves Context Window

Full episodic replay of a 50-turn session can consume 3,000–5,000 tokens in your LLM context window. For resumption scenarios, use selective replay: replay only turns tagged as decision, breakthrough, or solution with min_importance=0.90. This typically reduces token cost by 80% while preserving all causally important information. Reserve full replay for explicit user requests ("show me the full session").

Replay Selection Algorithm: From Archive to Active Context

[Diagram: the importance filter and the tag selector work together to extract a minimal, causally relevant replay set from a large episodic archive.]
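The selection step can be sketched in a few lines of Python over memories that have already been recalled. This is a minimal illustration, not Dakera's internal algorithm: the memory dictionaries with `content`, `importance`, and `tags` keys mirror the shape used in this guide's Python examples, and `select_for_replay` and `seq_of` are hypothetical helper names.

```python
KEY_TAGS = {"decision", "solution", "breakthrough"}


def seq_of(memory: dict) -> int:
    """Read the integer from a seq:<n> tag; 0 if absent or malformed."""
    for tag in memory.get("tags") or []:
        if tag.startswith("seq:") and tag[4:].isdigit():
            return int(tag[4:])
    return 0


def select_for_replay(memories: list[dict], session_id: str,
                      min_importance: float = 0.90) -> list[dict]:
    """Apply the session, importance, and tag filters, then restore order."""
    selected = []
    for m in memories:
        tags = set(m.get("tags") or [])
        if session_id not in tags:
            continue                      # belongs to another session
        if m.get("importance", 0.0) < min_importance:
            continue                      # importance filter
        if not (tags & KEY_TAGS):
            continue                      # tag selector
        selected.append(m)
    return sorted(selected, key=seq_of)


archive = [
    {"content": "[user] Fixed!", "importance": 0.95,
     "tags": ["episode", "sess-20260519", "role:user", "solution", "seq:5"]},
    {"content": "[user] TypeError", "importance": 0.80,
     "tags": ["episode", "sess-20260519", "role:user", "error", "seq:1"]},
    {"content": "[agent] Root cause", "importance": 0.92,
     "tags": ["episode", "sess-20260519", "role:agent", "decision", "seq:4"]},
    {"content": "[agent] Older fix", "importance": 0.95,
     "tags": ["episode", "sess-20260401", "role:agent", "solution", "seq:2"]},
]

replay_set = [m["content"] for m in select_for_replay(archive, "sess-20260519")]
```

With this sample archive, only the decision (seq 4) and the solution (seq 5) from the target session survive, in that order.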

Performance Characteristics

Full session replay (50 turns, top_k=200): <25 ms
Max recommended turns per session before compression: 200
Context window savings with selective replay: ~80%

Recall with top_k=200 for a session of 50 turns returns in approximately 20–25 ms. The client-side sort by sequence number adds under 1 ms. For sessions exceeding 200 turns, consider compressing intermediate exchanges into summaries using the Summarization & Decay pattern, keeping only decision points and the final resolution at full importance.

Edge Cases & Developer Gotchas

Gotcha 1: Sequence Numbers Must Be Monotonic

If you use floating point timestamps instead of integer sequence counters, two events that occur within the same millisecond (fast async code, batch operations) will have identical or reversed ordering. Solution: Use an integer counter that increments atomically with each stored turn. Keep this counter in your application state for the duration of the session. Never rely on timestamps alone for sequence ordering.
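One way to get an atomically incrementing counter in a multi-threaded Python application is to guard it with a lock. The sketch below is an assumption about how an application might hold this state; `SequenceCounter` is a hypothetical class, not part of the Dakera SDK.

```python
import threading


class SequenceCounter:
    """Issues 1, 2, 3, ... with no duplicates, even across threads."""

    def __init__(self, start: int = 0):
        self._value = start
        self._lock = threading.Lock()

    def next_value(self) -> int:
        # The lock makes increment-and-read a single atomic step.
        with self._lock:
            self._value += 1
            return self._value


# Eight threads each request 1,000 sequence numbers concurrently.
counter = SequenceCounter()
issued: list[int] = []


def worker() -> None:
    for _ in range(1000):
        issued.append(counter.next_value())


threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

After the threads finish, `issued` contains every integer from 1 to 8000 exactly once. Keep one counter per session so that each session's sequence starts at 1.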

Gotcha 2: top_k Too Small Drops Turns

If a session has 80 turns and you set top_k=50, the 30 lowest-ranked turns (typically the middle back-and-forth) are silently dropped. The replay is incomplete and the sequence has gaps. Solution: Set top_k to at least 2x the expected session length, with a generous default of 200 for replay queries. Storage cost per turn is negligible; do not optimize aggressively here.
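Because dropped turns are silent, it can help to check a replay for gaps before handing it to the model. The sketch below assumes the seq:<n> tag convention used in this guide's examples; `missing_sequences` and `seq_of` are hypothetical helper names.

```python
def seq_of(memory: dict) -> int:
    """Read the integer from a seq:<n> tag; 0 if absent or malformed."""
    for tag in memory.get("tags") or []:
        if tag.startswith("seq:") and tag[4:].isdigit():
            return int(tag[4:])
    return 0


def missing_sequences(memories: list[dict]) -> list[int]:
    """Return the sequence numbers absent between 1 and the highest seen."""
    seen = {seq_of(m) for m in memories} - {0}
    if not seen:
        return []
    return [n for n in range(1, max(seen) + 1) if n not in seen]


# A replay that lost turns 3 and 4 to a top_k that was too small.
replayed = [
    {"tags": ["sess-20260519", "seq:1"]},
    {"tags": ["sess-20260519", "seq:2"]},
    {"tags": ["sess-20260519", "seq:5"]},
]
gaps = missing_sequences(replayed)
print(gaps)  # [3, 4]
```

This check cannot see turns dropped from the end of the session, since it only knows the highest sequence number that came back. If the application records the final sequence number when the session closes, compare against that instead.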

Gotcha 3: Cross-Session Session ID Collisions

If your session ID generation is weak (e.g., sequential integers) and two different users happen to have the same session ID, a replay query will return mixed memories from different users. Solution: Use UUIDs or prefix session IDs with the user ID: user-alice-sess-a8f2b1. Never rely on simple numeric session IDs across a multi-tenant system.
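A user-prefixed, UUID-based session ID can be generated with the standard library alone. This is one possible format modeled on the user-alice-sess-a8f2b1 example above; `new_session_id` is a hypothetical helper, and the 12-character suffix length is an arbitrary choice for the sketch.

```python
import re
import uuid


def new_session_id(user_id: str) -> str:
    """Build an ID like user-alice-sess-3f9c2a7b1d04."""
    # Normalise the user ID so the result stays a single hyphenated token.
    safe_user = re.sub(r"[^a-z0-9]+", "-", user_id.lower()).strip("-")
    # uuid4 is random, so two sessions for one user get different suffixes.
    return f"user-{safe_user}-sess-{uuid.uuid4().hex[:12]}"


first = new_session_id("Alice")
second = new_session_id("Alice")
```

The user prefix keeps IDs from different tenants apart even if the random suffixes were ever to collide, and the random suffix separates sessions that belong to the same user.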

Gotcha 4: Storing Both Sides Doubles Your Memory Count

Every exchange requires two stores (user message + agent response). A conversation of 40 exchanges produces 80 memory entries. For long-lived users with hundreds of sessions, this accumulates quickly. Solution: Store only the agent responses that contain decisions, hypotheses, or significant information. Skip agent responses that are pure clarifying questions ("Could you share the error message?"), which add noise without signal. Increment the sequence counter only for turns you actually store, so the stored sequence has no gaps.

Gotcha 5: Recall Includes Memories From Other Sessions

When replaying session "sess-abc123", your query "sess-abc123 debugging turns" might also match memories from other sessions that contain similar debugging content. The session ID in the query helps but is not a hard filter. Solution: After recall, filter client-side by checking that metadata.session_id === targetSessionId or that the session ID tag is present in the memory's tags array. Only sort the filtered set.
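The filter-then-sort step might look like the following in Python. The memory shape (a dict with `content`, `tags`, and an optional `metadata` mapping) is an assumption based on this guide's examples, and `belongs_to`, `seq_of`, and `replay_session` are hypothetical helper names.

```python
def seq_of(memory: dict) -> int:
    """Read the integer from a seq:<n> tag; 0 if absent or malformed."""
    for tag in memory.get("tags") or []:
        if tag.startswith("seq:") and tag[4:].isdigit():
            return int(tag[4:])
    return 0


def belongs_to(memory: dict, session_id: str) -> bool:
    """Accept a memory if its tags or its metadata name the session."""
    tags = memory.get("tags") or []
    metadata = memory.get("metadata") or {}
    return session_id in tags or metadata.get("session_id") == session_id


def replay_session(memories: list[dict], session_id: str) -> list[str]:
    """Drop memories from other sessions, then sort the rest by sequence."""
    own = [m for m in memories if belongs_to(m, session_id)]
    return [m["content"] for m in sorted(own, key=seq_of)]


recalled = [
    {"content": "[agent] Add await.", "tags": ["sess-abc123", "seq:2"]},
    {"content": "[user] Null pointer in auth.", "tags": ["sess-zzz999", "seq:1"]},
    {"content": "[user] TypeError in parser.", "tags": ["sess-abc123", "seq:1"]},
]
ordered = replay_session(recalled, "sess-abc123")
```

Filtering before sorting matters: a stray memory from another session could otherwise carry the same sequence number as a real turn and land in the middle of the replay.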

SDK Reference

Operation	Python	TypeScript	Purpose
Store episode turn	`client.store_memory(agent_id, content, importance, memory_type, tags)`	`client.storeMemory(agentId, {content, importance, memoryType, tags})`	Persist one conversation turn
Recall for replay	`client.recall(agent_id, query, top_k)`	`client.recall(agentId, query, {top_k})`	Retrieve session turns (sort by seq after)
Selective replay	`client.recall(agent_id, query, top_k, min_importance)`	`client.recall(agentId, query, {top_k, min_importance})`	Only decisions/solutions (importance ≥ 0.90)
List sessions	`client.list_sessions()`	`client.listSessions()`	Browse past sessions to select for replay
Get session memories	`client.session_memories(session_id)`	`client.sessionMemories(sessionId)`	Direct session-scoped memory access
Compress old episodes	`client.update_importance(agent_id, memory_id, importance)`	`client.updateImportance(agentId, request)`	Archive old exploratory turns
Delete episode	`client.forget(agent_id, memory_id)`	`client.forget(agentId, memoryId)`	Remove a single turn from history
Bulk delete session	`client.batch_forget(request)`	—	Delete all turns of a completed session

Advanced Configuration

Session Compression After Completion

Once a debugging session is resolved, compress exploratory turns and keep only the causal chain:

def compress_completed_session(session_id: str, agent_id: str):
    """
    After session ends: keep decisions + solutions at high importance,
    archive exploratory turns to low importance.
    """
    results = client.recall(
        agent_id=agent_id,
        query=f"{session_id} episode",
        top_k=200
    )
    for m in results.get("memories", []):
        tags = m.get("tags") or []
        # Recall is not a hard filter: leave other sessions untouched.
        if session_id not in tags:
            continue
        # Key turns keep their current importance, so no update is needed.
        if any(t in tags for t in ["decision", "solution", "breakthrough"]):
            continue
        client.update_importance(
            agent_id=agent_id,
            memory_id=m["id"],
            importance=0.30
        )

Cross-Session Pattern Detection

Use replay across multiple sessions to find recurring patterns (e.g., the same error appearing in multiple sessions):

results = client.batch_recall({
    "queries": [
        {"agent_id": "codeassist-dev-7", "query": "TypeError async race condition sessions", "top_k": 20},
        {"agent_id": "codeassist-dev-7", "query": "null reference undefined errors root cause", "top_k": 20},
        {"agent_id": "codeassist-dev-7", "query": "solutions that worked parser module fixes", "top_k": 10}
    ]
})
# Detect that async race appeared in 3 separate sessions
# Recommend: add async linting rule to prevent recurrence

Training Data Extraction from Successful Sessions

Sessions with confirmed solutions make excellent fine-tuning examples. Extract decision + resolution pairs:

training_pairs = []
sessions = client.list_sessions()
for session in sessions:
    memories = client.session_memories(session["id"])
    decisions = [m for m in memories if "decision" in (m.get("tags") or [])]
    solutions = [m for m in memories if "solution" in (m.get("tags") or [])]
    if decisions and solutions:
        training_pairs.append({
            "problem": decisions[0]["content"],
            "resolution": solutions[-1]["content"]
        })

When to Use This Pattern

Coding assistants that debug multi-step issues across multiple sessions
Agent frameworks that need to resume interrupted multi-step workflows
Audit trail systems requiring exact chronological reconstruction
Training data extraction: successful session turns as fine-tuning examples
Quality assurance: replay agent decision sequences to investigate bad outcomes
Design review assistants tracking how requirements evolved over discussions
