Beginner Retrieval

User Preference Recall

Teach your agent to remember what each user cares about. Preferences compound over time into a personalization layer that makes every interaction feel tailor-made.

⏱ ~15 min to implement 📦 Requires: Dakera v0.11+

Start Building →

Prerequisites

Running Dakera server (Quickstart guide)
An SDK installed: Python, TypeScript, Rust, or Go
Basic familiarity with agent IDs and memory types

The Problem

Users expect AI assistants to remember them. The moment a user has to re-explain their preferred coding language, their dietary restrictions, or their time zone for the third time, trust in the AI erodes. Stateless AI is frustrating AI.

The naive solution — stuffing the entire chat history into the context window — hits two walls fast: token limits cap the lookback window at a few sessions, and raw chat is noisy signal for preference extraction. What you actually need is a lightweight, high-signal preference layer that accumulates facts about each user and surfaces the most relevant ones on demand.

Why not just use a user database?

SQL user tables store structured fields. Preferences are unstructured, fuzzy, and contextual — "prefers concise explanations when stressed" doesn't map to a column. Dakera's hybrid retrieval understands semantic intent alongside exact match, letting you store and retrieve freeform preferences that relational systems cannot.

Architecture: The Preference Accumulation Loop

This pattern uses a dedicated preferences namespace per user, separate from conversation history. Preferences are stored with high importance scores (0.85–1.0), which signals Dakera to surface them above general episodic memories. The hybrid retrieval engine combines BM25 keyword matching with vector similarity, so both exact queries ("timezone") and semantic queries ("what hours do they work") return the right memories.

Step-by-Step Implementation

Provision a preferences namespace per user

Use a consistent agent ID pattern like user-{userId}-prefs. This keeps preference memories isolated from conversation history, avoiding retrieval noise. Never mix preferences with episodic memories in the same agent ID.
Detect preferences during the conversation

Add a lightweight extraction step after each user message. Look for explicit statements ("I prefer X"), implicit signals ("can you use bullet points?"), and corrections ("actually, I switched to TypeScript"). Assign importance 0.85–1.0 for declared preferences and 0.7–0.84 for inferred ones.
Store with categorical tags

Tag each preference with a category such as language, format, timezone, dietary, or communication. Tags enable targeted recall — when generating code, only fetch language and format preferences, not dietary ones.
Recall at session start with a broad query

On every new session, fire a recall with min_importance=0.8 and a generic query like "user preferences and settings". Inject the results into your system prompt before the first user message arrives. Latency for this call is typically 8–15 ms.
Handle preference updates and conflicts

When a user changes a preference, use update_memory to overwrite the old value rather than storing a duplicate. If you cannot identify the exact memory ID, store the new preference with a higher importance score — Dakera's ranking will surface the newer, higher-scored memory first.

Implementation

# 1. Store a detected preference
curl -X POST http://localhost:3300/v1/memory/store   -H "Authorization: Bearer dk-..."   -H "Content-Type: application/json"   -d '{
    "agent_id": "user-alice-prefs",
    "content": "User prefers Python over JavaScript for all backend code examples",
    "importance": 0.95,
    "memory_type": "semantic",
    "tags": ["preference", "language", "coding"]
  }'

# 2. Store formatting preference
curl -X POST http://localhost:3300/v1/memory/store   -H "Authorization: Bearer dk-..."   -H "Content-Type: application/json"   -d '{
    "agent_id": "user-alice-prefs",
    "content": "User wants responses as concise bullet points, not paragraphs",
    "importance": 0.9,
    "memory_type": "semantic",
    "tags": ["preference", "format", "communication"]
  }'

# 3. Store timezone preference
curl -X POST http://localhost:3300/v1/memory/store   -H "Authorization: Bearer dk-..."   -H "Content-Type: application/json"   -d '{
    "agent_id": "user-alice-prefs",
    "content": "User is in Pacific timezone (UTC-8), works 9am-6pm PST",
    "importance": 0.92,
    "memory_type": "semantic",
    "tags": ["preference", "timezone", "availability"]
  }'

# 4. Recall relevant preferences at session start
curl "http://localhost:3300/v1/memory/recall?agent_id=user-alice-prefs&query=coding+language+style+preferences&top_k=5&min_importance=0.8"   -H "Authorization: Bearer dk-..."

# 5. Update a changed preference
curl -X PUT http://localhost:3300/v1/memory/{memory_id}   -H "Authorization: Bearer dk-..."   -H "Content-Type: application/json"   -d '{
    "content": "User switched from Python to TypeScript for new projects (May 2026)",
    "importance": 0.95
  }'

from dakera import DakeraClient
from typing import Optional

client = DakeraClient(base_url="http://localhost:3300", api_key="dk-...")

PREF_CATEGORIES = {
    "language": ["python", "javascript", "typescript", "rust", "go", "ruby"],
    "format": ["bullet", "concise", "verbose", "markdown", "code block"],
    "timezone": ["utc", "pst", "est", "gmt", "cet"],
    "communication": ["formal", "casual", "brief", "detailed"],
}

def detect_preference_category(content: str) -> list[str]:
    """Simple heuristic — replace with LLM classifier in production."""
    tags = ["preference"]
    content_lower = content.lower()
    for cat, keywords in PREF_CATEGORIES.items():
        if any(kw in content_lower for kw in keywords):
            tags.append(cat)
    return tags

def store_preference(
    user_id: str,
    preference: str,
    importance: float = 0.9,
    memory_id: Optional[str] = None
) -> dict:
    """Store or update a user preference."""
    tags = detect_preference_category(preference)

    if memory_id:
        # Update existing preference
        return client.update_memory(
            agent_id=f"user-{user_id}-prefs",
            memory_id=memory_id,
            content=preference,
            importance=importance
        )

    return client.store_memory(
        agent_id=f"user-{user_id}-prefs",
        content=preference,
        memory_type="semantic",
        importance=importance,
        tags=tags
    )

# Seed Alice's preferences on first interaction
store_preference("alice", "User prefers Python over JavaScript for all backend code", 0.95)
store_preference("alice", "User wants concise bullet-point responses, not paragraphs", 0.9)
store_preference("alice", "User works in Pacific timezone (UTC-8), 9am-6pm PST", 0.92)
store_preference("alice", "User is senior engineer, skip beginner explanations", 0.88)
store_preference("alice", "User prefers async/await patterns over callbacks", 0.87)

def build_system_prompt(user_id: str, task_context: str) -> str:
    """Build a personalized system prompt from stored preferences."""
    prefs = client.recall(
        agent_id=f"user-{user_id}-prefs",
        query=f"preferences relevant to: {task_context}",
        top_k=5,
        min_importance=0.8
    )

    if not prefs.get("memories"):
        return "You are a helpful assistant."

    pref_lines = "
".join(
        f"- {m['content']}" for m in prefs["memories"]
    )
    return f"""You are a personalized assistant. Apply these known user preferences:
{pref_lines}

Always follow these preferences without being asked."""

# Use at the start of every session
system = build_system_prompt("alice", "writing Python API code")
print(system)
# Output:
# You are a personalized assistant. Apply these known user preferences:
# - User prefers Python over JavaScript for all backend code
# - User wants concise bullet-point responses, not paragraphs
# - User is senior engineer, skip beginner explanations
# ...

import { DakeraClient } from '@dakera-ai/dakera';

const client = new DakeraClient({ baseUrl: 'http://localhost:3300', apiKey: 'dk-...' });

interface PreferenceOptions {
  importance?: number;
  tags?: string[];
  memoryId?: string;
}

async function storePreference(
  userId: string,
  preference: string,
  options: PreferenceOptions = {}
): Promise<void> {
  const { importance = 0.9, tags = ['preference'], memoryId } = options;

  if (memoryId) {
    await client.updateMemory(`user-${userId}-prefs`, memoryId, {
      content: preference,
      importance
    });
    return;
  }

  await client.storeMemory(`user-${userId}-prefs`, {
    content: preference,
    memoryType: 'semantic',
    importance,
    tags
  });
}

// Seed preferences
await storePreference('alice', 'User prefers Python for backend, TypeScript for frontend', {
  importance: 0.95,
  tags: ['preference', 'language']
});
await storePreference('alice', 'User wants concise bullet-point responses', {
  importance: 0.90,
  tags: ['preference', 'format']
});
await storePreference('alice', 'User is in Pacific timezone (UTC-8)', {
  importance: 0.92,
  tags: ['preference', 'timezone']
});
await storePreference('alice', 'User is senior engineer — skip beginner explanations', {
  importance: 0.88,
  tags: ['preference', 'communication']
});

// Build personalized system prompt at session start
async function buildSystemPrompt(userId: string, taskContext: string): Promise<string> {
  const prefs = await client.recall(`user-${userId}-prefs`, `preferences for: ${taskContext}`, {
    top_k: 6,
    min_importance: 0.8
  });

  if (!prefs.memories.length) return 'You are a helpful assistant.';

  const prefLines = prefs.memories.map(m => `- ${m.content}`).join('
');
  return `You are a personalized assistant. Known user preferences:
${prefLines}

Always apply these without being asked.`;
}

// Example: fetch language-specific preferences using search
async function getLanguagePrefs(userId: string): Promise<string[]> {
  const results = await client.searchMemories(`user-${userId}-prefs`, 'programming language framework preference', { top_k: 3 });
  return results.memories.map(m => m.content);
}

const system = await buildSystemPrompt('alice', 'writing REST API code');

use dakera_rs::{Client, StoreMemoryRequest, RecallRequest};

let client = Client::new("http://localhost:3300", "dk-...");

// Store preferences at high importance
client.store_memory("user-alice-prefs", StoreMemoryRequest {
    content: "User prefers Python over JavaScript for all backend code".into(),
    memory_type: "semantic".into(),
    importance: Some(0.95),
    tags: vec!["preference".into(), "language".into()],
    ..Default::default()
}).await?;

client.store_memory("user-alice-prefs", StoreMemoryRequest {
    content: "User wants concise bullet-point responses, not paragraphs".into(),
    memory_type: "semantic".into(),
    importance: Some(0.90),
    tags: vec!["preference".into(), "format".into()],
    ..Default::default()
}).await?;

client.store_memory("user-alice-prefs", StoreMemoryRequest {
    content: "User is in Pacific timezone (UTC-8), works 9am-6pm PST".into(),
    memory_type: "semantic".into(),
    importance: Some(0.92),
    tags: vec!["preference".into(), "timezone".into()],
    ..Default::default()
}).await?;

// Recall at session start
let prefs = client.recall("user-alice-prefs", RecallRequest {
    query: "coding language style format preferences".into(),
    top_k: Some(5),
    min_importance: Some(0.8),
    ..Default::default()
}).await?;

// Build system prompt from recalled preferences
let pref_text = prefs.memories
    .iter()
    .map(|m| format!("- {}", m.content))
    .collect::<Vec<_>>()
    .join("
");

let system_prompt = format!(
    "You are a personalized assistant. Known user preferences:
{}

Apply these at all times.",
    pref_text
);

package main

import (
    "context"
    "fmt"
    "strings"
    "github.com/dakera-ai/dakera-go"
)

func main() {
    client := dakera.NewClient("http://localhost:3300", "dk-...")
    ctx := context.Background()

    // Store preferences at high importance
    prefs := []struct {
        content    string
        importance float64
        tags       []string
    }{
        {"User prefers Python for backend, TypeScript for frontend", 0.95, []string{"preference", "language"}},
        {"User wants concise bullet-point responses", 0.90, []string{"preference", "format"}},
        {"User is in Pacific timezone (UTC-8)", 0.92, []string{"preference", "timezone"}},
        {"User is senior engineer, skip beginner explanations", 0.88, []string{"preference", "communication"}},
    }

    for _, p := range prefs {
        client.StoreMemory(ctx, "user-alice-prefs", dakera.StoreMemoryRequest{
            Content:    p.content,
            MemoryType: "semantic",
            Importance: p.importance,
            Tags:       p.tags,
        })
    }

    // Recall at session start
    recalled, _ := client.Recall(ctx, "user-alice-prefs", dakera.RecallRequest{
        Query:         "coding language style preferences",
        TopK:          5,
        MinImportance: 0.8,
    })

    lines := make([]string, 0, len(recalled.Memories))
    for _, m := range recalled.Memories {
        lines = append(lines, "- "+m.Content)
    }

    systemPrompt := fmt.Sprintf(
        "You are a personalized assistant. Known preferences:
%s

Apply these without being asked.",
        strings.Join(lines, "
"),
    )
    _ = systemPrompt
}

Ready to make your AI remember every user?

Spin up Dakera in 2 minutes and start storing preferences immediately.

Get Started →

Before & After: Memory State

Here is the concrete difference between a session with and without the User Preference Recall pattern. The left column shows raw agent state (no memory). The right column shows the preference store after 3 interactions.

Before: No Memory

{
  // Agent has no persistent state.
  // Every session starts blank.
  // User must re-state preferences
  // every single conversation.

  agent_id: "user-alice",
  memories: [],

  // Result: frustrating repetition
  // "As I mentioned before..."
  // "I already told you I use Python"
}

After: Preference Store

{
  "agent_id": "user-alice-prefs",
  "memory_count": 5,
  "memories": [
    {
      "content": "Prefers Python for backend",
      "importance": 0.95,
      "tags": ["preference","language"],
      "type": "semantic"
    },
    {
      "content": "Wants bullet-point responses",
      "importance": 0.90,
      "tags": ["preference","format"]
    },
    {
      "content": "Pacific timezone (UTC-8)",
      "importance": 0.92,
      "tags": ["preference","timezone"]
    },
    {
      "content": "Senior engineer, skip basics",
      "importance": 0.88,
      "tags": ["preference","skill"]
    },
    {
      "content": "Prefers async/await patterns",
      "importance": 0.87,
      "tags": ["preference","language"]
    }
  ]
}

Real-World Example: E-Commerce Recommendation Agent

Scenario: ShopFlow AI, a fashion e-commerce platform, deploys an AI shopping assistant. Without preference recall, every product recommendation session asks the same sizing questions. With this pattern, the agent accumulates a preference profile over the customer's first 3 interactions.

Customer Journey: First 3 Sessions

Session 1 (Day 1): Customer says "I usually wear medium in European sizing and I'm looking for sustainable brands." The agent stores: EU size Medium, prefers sustainable fashion brands (importance: 0.92).

Session 2 (Day 3): "Can you only show me items under $150?" stored as: Budget ceiling $150 for clothing (importance: 0.88). Customer mentions they hate synthetic fabrics: Prefers natural fabrics: cotton, linen, wool (importance: 0.91).

Session 3 (Day 10): Customer asks about winter coats. The agent recalls all 4 preferences in <12 ms and instantly filters the catalog to EU Medium, sustainable brands, natural fabrics, under $150. No questions asked. Conversion rate increases 34% compared to the stateless version.

Pro Tip: Preference Confidence Tiers

Use importance scores as confidence tiers. Explicitly stated preferences get 0.9–1.0. Inferred from behavior get 0.7–0.89. Guessed from context get 0.5–0.69. When recalling, set min_importance=0.8 to only include high-confidence facts in the system prompt. Lower confidence preferences can be surfaced as suggestions ("Based on past orders, you might prefer...").

Preference Signal Flow: From Behavior to Personalization

This diagram shows how raw user behavior is converted into structured preference memories that feed back into personalized agent responses.

Performance Characteristics

<15ms

Preference recall p99 latency (top_k=5)

~2KB

Storage per user (50 preferences)

10k+

Concurrent user preference stores supported

Preference recall adds roughly 8–15 ms to session startup — well within an acceptable pre-flight cost. At scale, prefer batching the recall call with your session initialization rather than making it a blocking dependency. A cold agent ID (first recall ever) may take 20–30 ms while the index warms; subsequent calls for the same agent ID are consistently under 10 ms.

Edge Cases & Developer Gotchas

Gotcha 1: Stale Preferences Win Over Fresh Behavior

If a user stored "prefers JavaScript" two months ago and switched to TypeScript last week, the old memory may still outrank newer inferences if both have equal importance. Solution: When detecting a preference change, explicitly call update_memory or update_importance to downrank the old entry rather than just adding the new one. Always store changed preferences with importance 0.05 higher than the previous version.

Gotcha 2: Preference Namespace Contamination

Storing both preferences and conversation history under the same agent ID causes recall to mix them together. A query for "language preference" may return last week's Python debugging chat instead of the stored preference. Solution: Always use a separate agent ID suffix like -prefs for preferences. Keep episodic conversation memories under a different agent ID like user-alice-history.

Gotcha 3: Low top_k Misses Category Diversity

Using top_k=3 for a general "what are this user's preferences?" query may return 3 language preferences and miss timezone and format preferences entirely — because they are all semantically similar. Solution: Use top_k=8 or higher for the session-start recall, or issue multiple targeted recalls by category tag. The storage cost is negligible.

Gotcha 4: Missing Preferences on First Session

For brand-new users, recall will return an empty array. Your code must handle this gracefully — do not crash or inject an empty preferences block into the system prompt. Solution: Always have a default system prompt fallback. Only inject preferences when memories.length > 0. Consider adding a "getting to know you" flow for first-time users that proactively asks for key preferences.

Gotcha 5: Conflicting Preferences From Different Contexts

A user who prefers "detailed explanations" for learning sessions but "bullet points" for production code questions may have conflicting memories. Both are valid — they just apply in different contexts. Solution: Store context metadata with each preference (e.g., tag with context:learning or context:production) and filter by context tag during recall.

SDK Reference

Operation	Python	TypeScript	Purpose
Store preference	`client.store_memory(agent_id, content, importance, memory_type, tags)`	`client.storeMemory(agentId, {content, importance, memoryType, tags})`	Persist a new preference
Recall preferences	`client.recall(agent_id, query, top_k, min_importance)`	`client.recall(agentId, query, {top_k, min_importance})`	Retrieve top preferences by relevance
Search by keyword	`client.search_memories(agent_id, query, top_k)`	`client.searchMemories(agentId, query, {top_k})`	BM25 keyword search in preferences
Update preference	`client.update_memory(agent_id, memory_id, ...)`	`client.updateMemory(agentId, memoryId, request)`	Overwrite a changed preference
Re-rank preference	`client.update_importance(agent_id, memory_id, importance)`	`client.updateImportance(agentId, request)`	Boost or demote a preference's rank
Delete preference	`client.forget(agent_id, memory_id)`	`client.forget(agentId, memoryId)`	Remove a preference (e.g., user revoked it)
Get single preference	`client.get_memory(agent_id, memory_id)`	—	Fetch a specific preference by ID

Advanced Configuration

TTL-Based Preference Expiry

For time-sensitive preferences — like a user's interest in a sale event — use ttl_seconds to auto-expire them. Permanent preferences like language choice should have no TTL.

# Expire sale interest after 7 days
client.store_memory(
    agent_id="user-alice-prefs",
    content="User is actively shopping for winter coats (sale season)",
    importance=0.75,
    memory_type="semantic",
    tags=["preference", "seasonal", "shopping"],
    ttl_seconds=604800  # 7 days
)

Batch Recall for Multi-Agent Personalization

When multiple agents need the same user's preferences simultaneously (e.g., a recommendation agent and a support agent), use batch_recall to fan out the query in a single round-trip:

results = client.batch_recall({
    "queries": [
        {"agent_id": "user-alice-prefs", "query": "product category preferences", "top_k": 5},
        {"agent_id": "user-alice-prefs", "query": "communication style preferences", "top_k": 3},
        {"agent_id": "user-alice-prefs", "query": "budget and pricing preferences", "top_k": 3}
    ]
})
# Returns all 3 result sets in a single response

Preference Namespacing Strategy

For multi-tenant SaaS products, use double-namespacing to keep user preferences isolated:

# Pattern: {tenant}-{userId}-prefs
agent_id = f"tenant-acme-user-alice-prefs"

# Combine with Dakera namespace isolation
# for hard tenant boundaries at the server level

When to Use This Pattern

Personalized AI assistants serving returning users more than once
E-commerce recommendation engines learning taste over time
Customer support bots that remember ticket history and preferences
Coding assistants that remember language, framework, and style choices
Health or wellness apps remembering dietary restrictions and goals
Any product where "the AI remembers you" is a differentiating feature

Build AI that actually remembers your users

Dakera is the memory layer that makes personalization permanent. Ship your first preference-aware agent in under 15 minutes.

Read the Quickstart → API Reference