Feedback Loop Memory
Every user correction is a free training signal — but most agents throw it away. When a user says "no, don't use bullet points" or "actually I prefer UK English", that preference should persist across every future session. This pattern closes the feedback loop: corrections become permanent memories that prevent repeated mistakes and accumulate into a personalized behavioral profile.
Start Building Free →- Dakera instance running (quickstart)
- SDK installed:
pip install dakera/npm i @dakera-ai/dakera - Correction detection logic — pattern matching on "no", "actually", "don't", "I prefer", "stop doing", explicit reformatting requests
- A domain tagging strategy so corrections are scoped correctly and don't bleed across unrelated domains
The Problem
Without feedback memory, agents are Sisyphean — they learn nothing across sessions. Three failure modes compound over time:
- Repeated style violations: A user corrects "please don't use emoji in technical docs" once, twice, three times. Without persistent storage, the agent reverts to its defaults every session. By the fifth correction, the user has given up and the product has lost their trust.
- Lost domain calibration: A technical writer corrects how the agent structures API documentation. A legal researcher corrects citation format. A marketer corrects brand voice. These are domain-specific lessons that should persist indefinitely, but evaporate with each new session.
- No positive reinforcement: When the agent does something well and the user says "yes, exactly like that — do it this way every time," there's nowhere to store that positive signal. The agent can't consistently replicate what worked.
Architecture
This pattern stores every detected correction as a lesson memory with very high importance (0.92+). Lessons are tagged by domain so they can be recalled selectively. Before generating any response, the agent recalls lessons relevant to the current domain and injects them as behavioral constraints into the system prompt. Positive feedback boosts existing memory importance; negative feedback creates new lesson entries.
Feedback signal flow: negative corrections create new lesson memories; positive feedback boosts importance of existing memories. Both feed the recall loop for future responses in the same domain.
Memory correction loop: lesson memories are recalled before every response, filtered to relevant domain, and injected as behavioral constraints in the system prompt. The agent cannot violate stored lessons.
Implementation Steps
-
Build a correction detectorScan each user message for correction signals: negation words ("no", "don't", "stop", "never", "please avoid"), reformulation phrases ("actually", "I meant", "I prefer", "instead use"), and explicit instruction phrases ("always do X", "from now on", "every time you"). Classify detected corrections by domain (formatting, tone, language, code style, factual accuracy).
-
Store lessons with very high importance and domain tagsCorrections get importance 0.92–0.96 — the highest of any memory type. They must surface in recall every single time. Include both the wrong behavior and the correct behavior in the content: "LESSON: Do NOT use bullet points when writing blog posts. Use flowing prose paragraphs instead." Tag with the domain:
domain:formatting,domain:language,domain:code. -
Detect positive signals and boost importanceWhen the user says "exactly like that", "perfect", "yes, this is what I wanted", detect the positive signal and use
update_importance()to increase the importance of the most recent memory in that domain by 0.05. This reinforces memories that describe behaviors the user loves, making them more likely to surface consistently. -
Recall domain-relevant lessons before every responseBefore generating any response, identify the domain of the current task (writing, coding, analysis, etc.) and recall the top 8 lessons for that domain. Use
min_importance=0.85to filter out weak signals. Inject all recalled lessons as a numbered constraint list in the system prompt: "User behavioral rules: 1. Never use bullet points..." -
Periodically consolidate contradicting lessonsOver time, users may give contradictory instructions (prefer bullets for summaries but not for articles). Detect contradictions at storage time by recalling existing lessons in the same domain and comparing. When a new lesson directly contradicts an existing one, reduce the old lesson's importance to 0.50 and tag it as
superseded. Store both so history is preserved.
A lesson memory needs both sides of the correction to be actionable: "LESSON: Do NOT write bullet point lists for blog content. Always use continuous prose paragraphs with clear topic sentences." A lesson that only says "no bullet points" gives the model no positive direction and leads to inconsistent behavior. The more specific the positive instruction, the more reliable the correction.
Implementation
# Store a lesson from user correction
curl -X POST http://localhost:3300/v1/memory/store \
-H "Authorization: Bearer dk-..." \
-H "Content-Type: application/json" \
-d '{
"agent_id": "writing-assistant-user-1",
"content": "LESSON: Do NOT use bullet point lists when writing blog posts or articles. Always use flowing prose paragraphs with clear topic sentences instead.",
"importance": 0.94,
"memory_type": "semantic",
"tags": ["type:lesson", "domain:formatting", "feedback:negative"],
"metadata": {
"correction_type": "formatting",
"wrong_behavior": "bullet point lists",
"correct_behavior": "prose paragraphs",
"trigger": "user_correction"
}
}'
# Store a positive lesson (user loved a specific behavior)
curl -X POST http://localhost:3300/v1/memory/store \
-H "Authorization: Bearer dk-..." \
-H "Content-Type: application/json" \
-d '{
"agent_id": "writing-assistant-user-1",
"content": "LESSON: User prefers Oxford comma usage. Always include Oxford comma in lists. User explicitly confirmed: '''exactly like this every time'''.",
"importance": 0.91,
"memory_type": "semantic",
"tags": ["type:lesson", "domain:language", "feedback:positive"],
"metadata": {"correction_type": "punctuation", "correct_behavior": "oxford_comma"}
}'
# Boost importance after positive feedback on a recalled memory
curl -X POST http://localhost:3300/v1/memory/mem_abc123/importance \
-H "Authorization: Bearer dk-..." \
-H "Content-Type: application/json" \
-d '{"agent_id": "writing-assistant-user-1", "importance": 0.96}'
# Recall domain lessons before responding
curl "http://localhost:3300/v1/memory/recall?agent_id=writing-assistant-user-1&query=lessons+corrections+formatting+writing+style+rules&top_k=8&min_importance=0.85" \
-H "Authorization: Bearer dk-..."import re
from dakera import DakeraClient
from typing import Optional
client = DakeraClient(base_url="http://localhost:3300", api_key="dk-...")
CORRECTION_PATTERNS = [
r"\bdon'?t\b.*\b(use|write|include|add|put)\b",
r"\b(never|stop|no more|avoid)\b.*\b(using|writing|adding|including)\b",
r"\b(please|always)\b.*\binstead\b",
r"\bactually,?\b",
r"\bi prefer\b",
r"\bfrom now on\b",
r"\bevery time\b",
r"\balways\b.*\b(use|do|write|format)\b",
]
POSITIVE_PATTERNS = [
r"\bexactly (like|right|that|what i)\b",
r"\b(perfect|yes,? that'?s?|love this|great job)\b",
r"\bdo it this way\b",
r"\bkeep (doing|it like)\b",
]
DOMAIN_KEYWORDS = {
"formatting": ["bullet", "list", "header", "section", "paragraph", "bold", "italic", "indent", "table"],
"language": ["english", "british", "american", "spelling", "grammar", "comma", "apostrophe", "punctuation"],
"tone": ["formal", "casual", "technical", "friendly", "professional", "warm", "direct", "concise"],
"code": ["typescript", "python", "javascript", "function", "async", "class", "variable", "comment"],
"structure": ["introduction", "conclusion", "outline", "summary", "example", "reference", "citation"],
}
def detect_correction(message: str) -> Optional[dict]:
"""Returns correction info if the message contains a correction signal."""
msg_lower = message.lower()
for pattern in CORRECTION_PATTERNS:
if re.search(pattern, msg_lower):
domain = classify_domain(msg_lower)
return {"type": "correction", "domain": domain, "message": message}
return None
def detect_positive_feedback(message: str) -> bool:
msg_lower = message.lower()
return any(re.search(p, msg_lower) for p in POSITIVE_PATTERNS)
def classify_domain(message: str) -> str:
message_lower = message.lower()
scores = {domain: 0 for domain in DOMAIN_KEYWORDS}
for domain, keywords in DOMAIN_KEYWORDS.items():
scores[domain] = sum(1 for kw in keywords if kw in message_lower)
return max(scores, key=scores.get) if any(scores.values()) else "general"
def store_lesson(agent_id: str, wrong_behavior: str, correct_behavior: str, domain: str) -> None:
"""Store a user correction as a permanent behavioral lesson."""
content = (
f"LESSON: Do NOT {wrong_behavior} when writing. "
f"Always {correct_behavior} instead."
)
client.store_memory(
agent_id=agent_id,
content=content,
importance=0.94, # Very high -- lessons must always surface
memory_type="semantic",
tags=["type:lesson", f"domain:{domain}", "feedback:negative"]
)
def store_positive_lesson(agent_id: str, behavior: str, domain: str) -> None:
"""Store a preferred behavior confirmed by user positive feedback."""
client.store_memory(
agent_id=agent_id,
content=f"LESSON: User confirmed they love this: {behavior}. Always do this.",
importance=0.91,
memory_type="semantic",
tags=["type:lesson", f"domain:{domain}", "feedback:positive"]
)
def boost_lesson_importance(agent_id: str, memory_id: str) -> None:
"""Increase importance of a lesson the user positively reinforced."""
client.update_importance(
agent_id=agent_id,
memory_id=memory_id,
importance=0.96
)
def get_lessons_for_domain(agent_id: str, domain: str) -> list[str]:
"""Recall all lessons relevant to the current domain."""
result = client.recall(
agent_id=agent_id,
query=f"lessons corrections rules for {domain} writing style behavior",
top_k=8,
min_importance=0.85
)
return [
mem["content"] for mem in result.get("memories", [])
if "LESSON:" in mem["content"]
]
def build_behavioral_constraints(agent_id: str, task_domain: str) -> str:
"""Build a system prompt section from recalled lessons."""
lessons = get_lessons_for_domain(agent_id, task_domain)
if not lessons:
return ""
constraint_block = "User behavioral rules (MUST follow):
"
for i, lesson in enumerate(lessons, 1):
# Strip the LESSON: prefix for cleaner prompt injection
rule = lesson.replace("LESSON: ", "").strip()
constraint_block += f"{i}. {rule}
"
return constraint_block
# --- Full writing assistant workflow ---
def process_user_message(agent_id: str, message: str, task_domain: str = "general") -> str:
# 1. Check for feedback signals
correction = detect_correction(message)
is_positive = detect_positive_feedback(message)
if correction:
# Parse wrong/correct from message (simplified -- use LLM extraction in production)
store_lesson(agent_id, "use bullet points", "use prose paragraphs", correction["domain"])
return "Got it — I'll remember to always use prose paragraphs going forward."
if is_positive:
# Boost the most recently recalled memory
# In production: pass the memory_id from the last recall call
return "Great, I'll keep doing it this way!"
# 2. Build constraints from recalled lessons
constraints = build_behavioral_constraints(agent_id, task_domain)
# 3. Inject into system prompt for your LLM call
# system_prompt = f"{BASE_SYSTEM_PROMPT}
{constraints}"
# response = llm.complete(system=system_prompt, user=message)
return f"[Constraints active]
{constraints}"
# --- Example ---
agent = "writing-assistant-user-1"
store_lesson(agent, "use bullet points in blog articles", "use flowing prose paragraphs with clear topic sentences", "formatting")
store_lesson(agent, "use American English spellings", "use British English (colour, realise, etc.)", "language")
print(build_behavioral_constraints(agent, "formatting"))import { DakeraClient } from '@dakera-ai/dakera';
const client = new DakeraClient({ baseUrl: 'http://localhost:3300', apiKey: 'dk-...' });
const CORRECTION_PATTERNS = [
/don'?t.*(use|write|include|add)/i,
/b(never|stop|avoid|no more).*b(using|writing|adding)/i,
/b(always|please).*instead/i,
/actually,?/i,
/i prefer/i,
/from now on/i,
];
const POSITIVE_PATTERNS = [
/exactly (like|right|that)/i,
/b(perfect|love this|great job)/i,
/do it this way/i,
];
function detectCorrection(message: string): boolean {
return CORRECTION_PATTERNS.some(p => p.test(message));
}
function detectPositiveFeedback(message: string): boolean {
return POSITIVE_PATTERNS.some(p => p.test(message));
}
async function storeLesson(
agentId: string,
wrongBehavior: string,
correctBehavior: string,
domain: string
): Promise<void> {
await client.storeMemory(agentId, {
content: `LESSON: Do NOT ${wrongBehavior}. Always ${correctBehavior} instead.`,
importance: 0.94,
memoryType: 'semantic',
tags: ['type:lesson', `domain:${domain}`, 'feedback:negative'],
});
}
async function storePositiveLesson(agentId: string, behavior: string, domain: string): Promise<void> {
await client.storeMemory(agentId, {
content: `LESSON: User confirmed they prefer: ${behavior}. Always do this.`,
importance: 0.91,
memoryType: 'semantic',
tags: ['type:lesson', `domain:${domain}`, 'feedback:positive'],
});
}
async function boostLessonImportance(agentId: string, memoryId: string): Promise<void> {
await client.updateImportance(agentId, { memoryId, importance: 0.96 });
}
async function getConstraintsForDomain(agentId: string, domain: string): Promise<string> {
const result = await client.recall(
agentId,
`lessons corrections rules for ${domain} writing style`,
{ top_k: 8, min_importance: 0.85 }
);
const lessons = result.memories
.filter(m => m.content.startsWith('LESSON:'))
.map(m => m.content.replace('LESSON: ', '').trim());
if (lessons.length === 0) return '';
return 'User behavioral rules (MUST follow):
' +
lessons.map((l, i) => `${i + 1}. ${l}`).join('
');
}
// Usage
await storeLesson('writing-assistant-user-1', 'use bullet points in blog content', 'use prose paragraphs', 'formatting');
const constraints = await getConstraintsForDomain('writing-assistant-user-1', 'formatting');
console.log(constraints);use dakera_rs::{Client, StoreMemoryRequest, RecallRequest};
let client = Client::new("http://localhost:3300", "dk-...");
// Store a lesson from user correction
client.store_memory("writing-assistant-user-1", StoreMemoryRequest {
content: "LESSON: Do NOT use bullet point lists for blog posts. Always use flowing prose paragraphs with clear topic sentences instead.".into(),
importance: Some(0.94),
memory_type: "semantic".into(),
tags: vec!["type:lesson".into(), "domain:formatting".into(), "feedback:negative".into()],
..Default::default()
}).await?;
// Recall lessons for domain before responding
let lessons = client.recall("writing-assistant-user-1", RecallRequest {
query: "lessons corrections rules for formatting writing style".into(),
top_k: Some(8),
min_importance: Some(0.85),
..Default::default()
}).await?;
// Build constraint block for system prompt
let constraints: String = lessons.memories.iter()
.filter(|m| m.content.contains("LESSON:"))
.enumerate()
.map(|(i, m)| format!("{}. {}", i + 1, m.content.replace("LESSON: ", "")))
.collect::<Vec<_>>()
.join("
");
println!("User behavioral rules:
{}", constraints);
// Boost importance after positive feedback
client.forget("writing-assistant-user-1", "mem_abc123").await?; // example: remove outdated lessonclient := dakera.NewClient("http://localhost:3300", "dk-...")
ctx := context.Background()
// Store a lesson from correction
client.StoreMemory(ctx, "writing-assistant-user-1", dakera.StoreMemoryRequest{
Content: "LESSON: Do NOT use bullet points for blog articles. Always use prose paragraphs with clear topic sentences instead.",
Importance: 0.94,
MemoryType: "semantic",
Tags: []string{"type:lesson", "domain:formatting", "feedback:negative"},
Metadata: map[string]interface{}{
"wrong_behavior": "bullet_points",
"correct_behavior": "prose_paragraphs",
"domain": "formatting",
},
})
// Recall lessons before responding
lessons, _ := client.Recall(ctx, "writing-assistant-user-1", dakera.RecallRequest{
Query: "lessons corrections rules formatting writing style",
TopK: 8,
MinImportance: 0.85,
})
// Build constraint block
var constraints []string
for i, mem := range lessons.Memories {
if strings.HasPrefix(mem.Content, "LESSON:") {
rule := strings.TrimPrefix(mem.Content, "LESSON: ")
constraints = append(constraints, fmt.Sprintf("%d. %s", i+1, rule))
}
}
constraintBlock := "User behavioral rules:
" + strings.Join(constraints, "
")
fmt.Println(constraintBlock)Turn every correction into permanent improvement
Dakera memory closes the feedback loop — your agent learns from every interaction and never makes the same mistake twice.
Real-World Scenario: Writing Assistant Learning User Style
A writing assistant serves a technical author who writes developer documentation. Over 30 sessions, the agent accumulates a personalized style profile through feedback loop memory:
- Session 1: Agent uses emoji in a technical article. User corrects: "Please no emoji in technical content." Lesson stored: importance 0.94, domain: formatting.
- Session 3: Agent uses American spellings. User corrects: "I write British English — 'colour', 'realise'." Lesson stored: importance 0.94, domain: language.
- Session 7: Agent writes code examples in Python. User: "I prefer TypeScript examples for this audience." Lesson stored, domain: code.
- Session 12: Agent writes an intro paragraph in passive voice. User says "exactly like that — passive voice works here." Agent boosts that lesson's importance.
- Session 30: Agent has 11 active lessons spanning formatting, language, code, and structure. Every response is pre-screened against this behavioral profile before it's generated. Zero violations in sessions 20–30. User satisfaction score: 4.9/5.
Before / After Memory State
// Session 15:
User: "Please, for the 6th
time, don't use bullet
points in my articles!"
// Agent has no record of
// previous 5 corrections.
// Defaults to bullet points
// because model training
// prefers them.
// User considering abandoning
// the product.
// Memory store:
{
"content": "LESSON: Do NOT use bullet points. Always use prose paragraphs.",
"importance": 0.94,
"tags": ["type:lesson", "domain:formatting"]
}
// Injected into every prompt:
// "1. Never use bullet points.
// 2. Always UK English.
// 3. Prefer TypeScript code."
// Session 15: zero violations.
// User satisfaction: 4.9/5.
SDK Method Reference
| Method | SDK | Purpose in this pattern |
|---|---|---|
store_memory(agent_id, content, importance, memory_type, tags) | Python | Persist lesson with very high importance |
update_importance(agent_id, memory_id, importance) | Python | Boost lesson importance after positive feedback |
recall(agent_id, query, top_k, min_importance) | Python | Retrieve lessons for domain before responding |
storeMemory(agentId, {content, importance, memoryType, tags}) | TypeScript | Store lesson from user correction |
updateImportance(agentId, {memoryId, importance}) | TypeScript | Boost importance after positive signal |
recall(agentId, query, {top_k, min_importance}) | TypeScript | Recall domain-specific lessons |
client.store_memory("agent", StoreMemoryRequest{...}).await? | Rust | Async lesson storage with domain tags |
client.recall("agent", RecallRequest{...}).await? | Rust | Async lesson recall with min_importance |
client.StoreMemory(ctx, ...) | Go | Store lesson with metadata |
client.Recall(ctx, ...) | Go | Recall lessons for system prompt injection |
Edge Cases and Gotchas
- Lesson contradictions: A user asks for concise responses in one session and detailed explanations in another — for different tasks. Domain-scoping helps but isn't enough. Include context in lesson content: "LESSON: When writing API documentation, be concise. When writing tutorials, include detailed step-by-step examples." Specificity prevents false conflicts.
- Lesson accumulation saturation: After 50+ lessons, recall starts returning irrelevant content. Consolidate lessons quarterly: use the LLM to merge related lessons, remove superseded ones, and re-store consolidated rules. Use
batch_forget()to clear old lessons after consolidation. - Over-constraining the model: A list of 15 behavioral rules injected as constraints can reduce response quality by limiting the model's flexibility. Cap lesson injection at 8–10 rules per response. For contradicting rules, always use the most recently confirmed one.
- Ambiguous corrections: "No, that's not what I meant" is a correction signal but doesn't specify what was wrong. Don't store vague corrections as lessons. Only store lessons where you can clearly identify both the wrong behavior and the desired correct behavior — ideally extracted by asking the LLM to interpret the correction.
- Multi-user shared agents: If multiple users share an agent instance (e.g., a team assistant), lessons from one user will affect others. Always scope the agent_id to the individual user:
"writing-assistant-{user_id}". Never share lesson memory across users.
"Write this in bullet points" (task instruction for this response) is fundamentally different from "I always prefer bullet points" (persistent preference). Don't store task-level instructions as lessons — they'll incorrectly constrain future responses in different contexts. Only store behavioral rules the user explicitly intends to be persistent ("from now on", "always", "every time").
Performance Considerations
Advanced Configuration: Lesson versioning and contradiction detection
Automatically detect when a new lesson contradicts an existing one and handle the conflict gracefully:
async def store_lesson_with_dedup(agent_id: str, lesson: str, domain: str) -> None:
# 1. Recall existing lessons in same domain
existing = client.recall(
agent_id=agent_id,
query=f"lessons {domain} writing rules",
top_k=5,
min_importance=0.85
)
# 2. Ask LLM to detect contradiction (simplified example)
for mem in existing.get("memories", []):
if is_contradicting(lesson, mem["content"]):
# Supersede the old lesson
client.update_importance(
agent_id=agent_id,
memory_id=mem["id"],
importance=0.50 # Demote old lesson below recall threshold
)
# Tag as superseded
client.update_memory(
agent_id=agent_id,
memory_id=mem["id"],
tags=["type:lesson", f"domain:{domain}", "status:superseded"]
)
# 3. Store new lesson as definitive
client.store_memory(
agent_id=agent_id,
content=f"LESSON: {lesson}",
importance=0.94,
memory_type="semantic",
tags=["type:lesson", f"domain:{domain}"]
)
Build agents that learn from every correction
Close the feedback loop with Dakera memory. Every user correction becomes a permanent improvement — stored once, applied forever.
Start Building Free →