Data Flow 01: Context Compaction
Engineering documentation series - Data flows in the AI Assistant system
Overview
This document describes the data flows involved in context compaction, including:
- Sleep-triggered compaction (70% threshold)
- Emergency compaction (80% threshold)
- Pre-compaction fact extraction
Related Documents
| Document | Description |
|---|---|
| Context and Memory Flow Analysis | Original research and design |
| Data-Flow-02-ReAct-Loop | Tick loop execution |
| Data-Flow-03-Memory-Consolidation | Sleep phase memory operations |
1. Sleep-Triggered Compaction
Occurs during the dreaming phase of sleep mode when context usage exceeds 70%.
Trigger Point
assistant_script.py::at_tick()
└─> _run_sleep_tick()
└─> rag_memory.py::run_sleep_tick()
└─> compact_conversation_history(script, character)
Data Flow
┌─────────────────────────────────────────────────────────────────────────────┐
│ SLEEP TICK START │
│ Operating mode: sleep, Phase: dreaming │
└─────────────────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ MEMORY OPERATIONS (rag_memory.py::run_sleep_tick) │
│ ─────────────────────────────────────────────────────────────────────────── │
│ 1. Memory link generation (consolidate_memories_to_semantic) │
│ 2. Entity consolidation (helpers/entity_context.py::run_entity_consolidation_batch) │
│ 3. Orphaned memory link cleanup (prune_orphaned_memory_links) │
│ 4. Episodic prune (helpers/episodic_index.py::prune_low_importance_entries) │
│ 5. Stale conversation cleanup (helpers/working_memory.py::clear_stale_conversations) │
│ 6. Pre-compaction fact extraction (run_pre_compaction_extraction) │
└─────────────────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ COMPACTION CHECK (compact_conversation_history) │
│ ─────────────────────────────────────────────────────────────────────────── │
│ Input: │
│ script.db.conversation_history (full message list) │
│ script.db.compact_sleep_threshold (0.7 default) │
│ script.db.compact_preserve_window (20 default) │
│ script.db.max_context_tokens (100000 default) │
│ │
│ Check: token_count / max_tokens >= 0.7 │
│ └─> If below threshold: return {skipped: true} │
│ └─> If history <= preserve_window: return {skipped: true} │
└─────────────────────────────────────────────────────────────────────────────┘
│
▼ (threshold exceeded)
┌─────────────────────────────────────────────────────────────────────────────┐
│ SPLIT HISTORY │
│ ─────────────────────────────────────────────────────────────────────────── │
│ messages_to_compact = history[:-20] (older messages) │
│ messages_to_preserve = history[-20:] (recent messages) │
│ │
│ Example with 50 messages: │
│ [msg0, msg1, ... msg29] → TO COMPACT (30 messages) │
│ [msg30, msg31, ... msg49] → TO PRESERVE (20 messages) │
└─────────────────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ LLM SUMMARY CALL (generate_context_summary) │
│ ─────────────────────────────────────────────────────────────────────────── │
│ IMPORTANT: This is a SEPARATE LLM call with DIFFERENT context │
│ │
│ Payload sent to LLM: │
│ [system: DEFAULT_COMPACTION_PROMPT] │
│ [user: formatted messages_to_compact only] │
│ │
│ NOT the full conversation_history - only the old messages being summarized │
│ │
│ Returns: Summary text (string) │
└─────────────────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ BUFFER REPLACEMENT │
│ ─────────────────────────────────────────────────────────────────────────── │
│ Create compaction marker: │
│ { │
│ role: "system", │
│ content: "[CONTEXT SUMMARY]\n{summary}", │
│ metadata: {type: "compaction", compacted_count: 30, ...} │
│ } │
│ │
│ New history = [compaction_marker] + messages_to_preserve │
│ │
│ Before: 50 messages (~70,000 tokens) │
│ After: 21 messages (~15,000 tokens) = 78% reduction │
└─────────────────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ JOURNAL ARCHIVE │
│ ─────────────────────────────────────────────────────────────────────────── │
│ character.db.journal["entries"].append({ │
│ id: "compact_20251206_171500", │
│ content: "[CONTEXT SYNTHESIS]\n{summary}", │
│ source_type: "compaction", │
│ importance: 7, │
│ tags: ["compaction", "synthesis"] │
│ }) │
│ │
│ Purpose: Archive compacted context for future retrieval │
└─────────────────────────────────────────────────────────────────────────────┘
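Expressed as code, the flow above might look roughly like the sketch below. The function, attribute, and field names (`compact_conversation_history`, `compact_preserve_window`, the compaction marker, the journal entry) come from the diagrams on this page; the helper stubs, exact signatures, and return shape are assumptions, not the actual implementation.

```python
from datetime import datetime, timezone

# Stand-ins for helpers named elsewhere on this page (assumed behavior):
def count_conversation_tokens(messages):
    # Rough 4-characters-per-token estimate; the real helper lives in helpers/execution.py.
    return sum(len(str(m.get("content", ""))) // 4 for m in messages)

def generate_context_summary(script, messages):
    # The real call sends [system: DEFAULT_COMPACTION_PROMPT] + [user: formatted messages].
    return f"Summary of {len(messages)} compacted messages."

def compact_conversation_history(script, character, force=False):
    """Sketch of the compaction flow above; details are illustrative only."""
    history = script.db.conversation_history
    preserve = script.db.compact_preserve_window
    usage = count_conversation_tokens(history) / script.db.max_context_tokens

    if not force and usage < script.db.compact_sleep_threshold:
        return {"skipped": True, "reason": "below threshold"}
    if len(history) <= preserve:
        return {"skipped": True, "reason": "history within preserve window"}

    # Split: older messages get summarized, the recent window is kept verbatim.
    to_compact, to_preserve = history[:-preserve], history[-preserve:]

    # Separate LLM call whose payload contains ONLY the old messages.
    summary = generate_context_summary(script, to_compact)

    marker = {
        "role": "system",
        "content": f"[CONTEXT SUMMARY]\n{summary}",
        "metadata": {"type": "compaction", "compacted_count": len(to_compact)},
    }
    script.db.conversation_history = [marker] + to_preserve

    # Archive the synthesis in the journal for later retrieval.
    character.db.journal["entries"].append({
        "id": datetime.now(timezone.utc).strftime("compact_%Y%m%d_%H%M%S"),
        "content": f"[CONTEXT SYNTHESIS]\n{summary}",
        "source_type": "compaction",
        "importance": 7,
        "tags": ["compaction", "synthesis"],
    })
    return {"skipped": False, "compacted_count": len(to_compact)}
```

With 50 messages and the default preserve window of 20, the new history is 21 entries: the compaction marker plus the 20 preserved messages.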
2. Emergency Compaction
Occurs at the start of any tick when context usage exceeds 80%.
Trigger Point
assistant_script.py::at_tick()
└─> Check: token_count / max_tokens >= 0.8
└─> rag_memory.py::run_pre_compaction_extraction(max_iterations=3)
└─> rag_memory.py::compact_conversation_history(script, character, force=True)
Data Flow
┌─────────────────────────────────────────────────────────────────────────────┐
│ TICK START (awake mode) │
└─────────────────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ EMERGENCY CHECK (before any other processing) │
│ ─────────────────────────────────────────────────────────────────────────── │
│ Conditions: │
│ - operating_mode == "awake" │
│ - compact_enabled == True │
│ - conversation_history not empty │
│ - token_count / max_tokens >= compact_emergency_threshold (0.8) │
│ │
│ If triggered: │
│ logger.log_warn("Emergency compaction triggered: 85% context usage") │
│ yield run_pre_compaction_extraction(max_iterations=3) │
│ yield compact_conversation_history(script, character, force=True) │
└─────────────────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ PRE-COMPACTION EXTRACTION (limited iterations) │
│ ─────────────────────────────────────────────────────────────────────────── │
│ max_iterations=3 (vs 5 for sleep) to minimize delay │
│ Assistant journals critical facts while context still available │
└─────────────────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ COMPACTION (same flow as sleep, but force=True) │
│ ─────────────────────────────────────────────────────────────────────────── │
│ force=True means: │
│ - Uses emergency_threshold (0.8) not sleep_threshold (0.7) │
│ - Skips the "below threshold" check │
│ - Still respects preserve_window │
└─────────────────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ NORMAL TICK PROCESSING CONTINUES │
│ ─────────────────────────────────────────────────────────────────────────── │
│ With reduced context, tick can now proceed safely │
└─────────────────────────────────────────────────────────────────────────────┘
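A minimal sketch of the emergency guard at the start of `at_tick()`, using the conditions listed above and reusing the stand-in helpers from section 1 (`run_pre_compaction_extraction` is sketched in section 3). The generator-style `yield` mirrors the diagram; the surrounding structure and the `logger` object are assumptions.

```python
def emergency_compaction_check(script, character, logger):
    """Sketch of the guard run before any other awake-tick processing (assumed structure)."""
    db = script.db
    if db.operating_mode != "awake" or not db.compact_enabled:
        return
    if not db.conversation_history:
        return

    usage = count_conversation_tokens(db.conversation_history) / db.max_context_tokens
    if usage < db.compact_emergency_threshold:  # 0.8 by default
        return

    logger.log_warn(f"Emergency compaction triggered: {usage:.0%} context usage")

    # Journal critical facts first (fewer iterations than sleep to limit delay),
    # then force compaction regardless of the sleep threshold.
    yield run_pre_compaction_extraction(script, character, max_iterations=3)
    yield compact_conversation_history(script, character, force=True)
```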
3. Pre-Compaction Fact Extraction
Status: Implemented (commits df19f0ad1, 042fa758b)
This mechanism lets the assistant journal important facts while the full conversation context is still available, before compaction discards older messages.
Key Insight: Context Timing
┌─────────────────────────────────────────────────────────────────────────────┐
│ WHY THIS WORKS │
│ ─────────────────────────────────────────────────────────────────────────── │
│ │
│ The "journal facts" LLM call and the "compaction summary" LLM call are │
│ SEPARATE calls with DIFFERENT payloads: │
│ │
│ 1. Journal facts call: │
│ - Submits: FULL conversation_history (~80% of context limit) │
│ - Returns: Tool calls to journal facts │
│ │
│ 2. Tool execution (LOCAL, no LLM call): │
│ - Executes tool calls │
│ - Adds results to LOCAL buffer (now ~85%) │
│ - Facts saved to persistent journal │
│ │
│ 3. Compaction summary call: │
│ - Submits: ONLY messages_to_compact (NOT full buffer) │
│ - Buffer size doesn't matter - it's a different payload │
│ │
│ 4. Buffer replacement: │
│ - Replaces history with summary + preserved window │
│ - Buffer now ~25% of limit │
│ │
│ Tool calls from step 2 land in preserved window and age out naturally. │
│ Even if buffer temporarily exceeds 100%, we never submit that buffer. │
└─────────────────────────────────────────────────────────────────────────────┘
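To make step 3 concrete, the snippet below sketches how the summary-call payload could be assembled. `build_compaction_payload` and the message formatting are hypothetical; the point carried over from the diagram is that the payload contains only `messages_to_compact`, never the full (possibly oversized) buffer.

```python
def build_compaction_payload(messages_to_compact, compaction_prompt):
    """Hypothetical payload builder for the compaction summary call."""
    formatted = "\n".join(
        f"[{m.get('role', 'unknown')}] {m.get('content', '')}"
        for m in messages_to_compact
    )
    return [
        {"role": "system", "content": compaction_prompt},  # DEFAULT_COMPACTION_PROMPT
        {"role": "user", "content": formatted},            # old messages only
    ]
```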
Implementation
┌─────────────────────────────────────────────────────────────────────────────┐
│ rag_memory.py::run_pre_compaction_extraction() │
│ ─────────────────────────────────────────────────────────────────────────── │
│ │
│ Entry conditions: │
│ - pre_compact_extraction_enabled == True (default) │
│ - conversation_history has >= 5 messages │
│ │
│ Context type: CONTEXT_PRE_COMPACTION │
│ - Tools: noop, add_journal_entry, update_entity_observation │
│ - Execution mode: react_loop │
│ - Max iterations: 5 (sleep) or 3 (emergency) │
│ │
│ Loop termination: │
│ - LLM returns noop (explicit completion) │
│ - LLM returns no tool call │
│ - LLM requests non-journal tool (stops to prevent scope creep) │
│ - Max iterations reached │
│ │
│ Returns: │
│ { │
│ "success": True, │
│ "facts_recorded": 3, # Number of journal entries created │
│ "iterations": 4 # Loops executed before termination │
│ } │
└─────────────────────────────────────────────────────────────────────────────┘
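The entry conditions, termination rules, and return shape above condense into a short loop. This is a sketch that assumes two hypothetical helpers, `request_tool_call` and `execute_tool`, standing in for the real ReAct-loop plumbing; the signal, config attribute, and tool names come from this page.

```python
PRE_COMPACTION_TOOLS = {"noop", "add_journal_entry", "update_entity_observation"}

def run_pre_compaction_extraction(script, character, max_iterations=5):
    """Sketch of the extraction loop's control flow; LLM/tool plumbing is assumed."""
    if not script.db.pre_compact_extraction_enabled or len(script.db.conversation_history) < 5:
        return {"success": True, "facts_recorded": 0, "iterations": 0}

    script.db.context_signals["pre_compaction"] = True
    facts_recorded = iterations = 0
    try:
        for iterations in range(1, max_iterations + 1):
            # One ReAct step using the pre_compaction context type and journal tools only.
            tool_call = request_tool_call(script, context_type="pre_compaction")
            if tool_call is None or tool_call.name == "noop":
                break                                   # explicit or implicit completion
            if tool_call.name not in PRE_COMPACTION_TOOLS:
                break                                   # stop to prevent scope creep
            execute_tool(script, character, tool_call)  # fact saved to the journal
            facts_recorded += 1
    finally:
        script.db.context_signals["pre_compaction"] = False

    return {"success": True, "facts_recorded": facts_recorded, "iterations": iterations}
```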
Data Flow
┌─────────────────────────────────────────────────────────────────────────────┐
│ COMPACTION TRIGGER (sleep or emergency) │
│ Context: ~70-80% full │
└─────────────────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ PRE-COMPACTION REACT LOOP │
│ ─────────────────────────────────────────────────────────────────────────── │
│ Set context signal: script.db.context_signals["pre_compaction"] = True │
│ │
│ Prompt context type: "pre_compaction" │
│ - Instructs assistant to review context │
│ - Focus on journaling important facts │
│ - Use journal tools (add_journal_entry, update_entity_observation) │
│ │
│ Loop (max 3-5 iterations): │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ Iteration 1 │ │
│ │ LLM sees: full context + "extract important facts" prompt │ │
│ │ LLM returns: tool_call(add_journal_entry, "Alice is allergic...") │ │
│ │ Execute: fact saved to journal, result added to buffer │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ Iteration 2 │ │
│ │ LLM sees: context + previous tool result │ │
│ │ LLM returns: tool_call(add_journal_entry, "Quest deadline...") │ │
│ │ Execute: fact saved to journal, result added to buffer │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ Iteration 3 │ │
│ │ LLM sees: context + previous tool results │ │
│ │ LLM returns: noop (assistant judges extraction complete) │ │
│ │ Exit loop │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │
│ Clear signal: script.db.context_signals["pre_compaction"] = False │
└─────────────────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ COMPACTION (existing flow) │
│ ─────────────────────────────────────────────────────────────────────────── │
│ Buffer: ~85% (original + tool call results) │
│ Split: messages_to_compact vs messages_to_preserve │
│ Tool calls from extraction are in preserved window (recent) │
│ Summary LLM call: only sends messages_to_compact │
│ Replacement: buffer drops to ~25% │
└─────────────────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ RESULT │
│ ─────────────────────────────────────────────────────────────────────────── │
│ - Important facts preserved in journal (permanent) │
│ - Context summary includes synthesized information │
│ - Full context meaning captured before detail lost │
│ - Buffer ready for next tick cycle │
└─────────────────────────────────────────────────────────────────────────────┘
4. Configuration Reference
| Attribute | Default | Description |
|---|---|---|
| `compact_enabled` | True | Master switch for compaction |
| `compact_sleep_threshold` | 0.7 | Trigger during sleep at 70% |
| `compact_emergency_threshold` | 0.8 | Trigger emergency at 80% |
| `compact_preserve_window` | 20 | Messages to keep intact |
| `compact_model` | None | Override model for summary (None = main) |
| `compact_prompt` | None | Override prompt (None = default) |
| `max_context_tokens` | 100000 | Context limit for calculations |
| `pre_compact_extraction_enabled` | True | Enable pre-compaction fact extraction |
| `pre_compact_max_iterations` | 5 | Max extraction loop cycles |
| `pre_compact_prompt` | None | Custom extraction prompt (None = default) |
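As a worked example of how these defaults interact, the snippet below computes the absolute token counts at which each compaction path fires. The values are the defaults from the table; gathering them into a plain dict is only for illustration.

```python
# Defaults from the table above.
config = {
    "max_context_tokens": 100_000,
    "compact_sleep_threshold": 0.7,
    "compact_emergency_threshold": 0.8,
    "compact_preserve_window": 20,
}

sleep_trigger = int(config["max_context_tokens"] * config["compact_sleep_threshold"])
emergency_trigger = int(config["max_context_tokens"] * config["compact_emergency_threshold"])

print(f"Sleep-phase compaction at >= {sleep_trigger:,} tokens")      # 70,000
print(f"Emergency compaction at >= {emergency_trigger:,} tokens")    # 80,000
print(f"Most recent {config['compact_preserve_window']} messages are always kept verbatim")
```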
5. Key Files
| File | Key Functions | Purpose |
|---|---|---|
| `assistant_script.py` | Config attributes | Compaction settings |
| `assistant_script.py` | `at_tick()` | Emergency compaction check |
| `rag_memory.py` | `compact_conversation_history()` | Main compaction logic |
| `rag_memory.py` | `run_pre_compaction_extraction()` | Pre-compaction fact extraction |
| `rag_memory.py` | `run_sleep_tick()` | Sleep phase integration |
| `llm_interaction.py` | `generate_context_summary()` | LLM summary generation |
| `helpers/execution.py` | `count_conversation_tokens()` | Token counting |
| `prompt_contexts.py` | `CONTEXT_PRE_COMPACTION` | Pre-compaction context type |
Document created: 2025-12-06
Updated: 2025-12-06 - Pre-compaction extraction implemented
Updated: 2025-12-09 - Converted to semantic refs, updated helpers/ package paths
Related issue: #10 (closed)