[P1] Implement Conversation History Trimming #10

Closed
opened 2025-12-05 13:49:13 +00:00 by blightbow · 1 comment

Problem

`conversation_history` has no token-based trimming despite the existing `max_history` attribute. This leads to unbounded growth and potential context overflow.

Suggested Fix

  • Implement token-aware cleanup, not just count-based trimming
  • Trim at 80% of `max_context_tokens`
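The suggested fix could be sketched as follows. This is illustrative only: `estimate_tokens` is a stand-in heuristic (a real implementation would use the model's tokenizer), and the function names are assumptions, not the project's actual API.

```python
def estimate_tokens(message: dict) -> int:
    # Rough heuristic: ~4 characters per token. A real implementation
    # would count with the model's own tokenizer instead.
    return max(1, len(message.get("content", "")) // 4)

def trim_history(history: list[dict], max_context_tokens: int) -> list[dict]:
    """Drop the oldest messages until total tokens fit under 80% of the budget."""
    budget = int(max_context_tokens * 0.8)  # trim at 80%, per the suggested fix
    trimmed = list(history)
    total = sum(estimate_tokens(m) for m in trimmed)
    while trimmed and total > budget:
        total -= estimate_tokens(trimmed.pop(0))  # evict oldest first
    return trimmed
```

Trimming to a percentage of the budget (rather than exactly to it) leaves headroom for the next few turns before trimming is needed again.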

Priority

P1 — High Priority

Source

Architecture Audit 2025-12-03, Section 2: Memory Leak Potential


Resolution Summary

Implemented a two-phase context compaction system based on research from Claude Code, MemGPT, and JetBrains NeurIPS 2025 findings.

Implementation Details

Configuration Attributes (assistant_script.py:161-169):

  • compact_enabled - Toggle compaction on/off (default: True)
  • compact_sleep_threshold - Trigger at 70% context usage during sleep
  • compact_emergency_threshold - Emergency trigger at 80% usage
  • compact_preserve_window - Keep last 20 messages intact
  • compact_model - Optional override model for summarization
  • compact_prompt - Optional custom summarization prompt
  • last_compaction - Timestamp tracking

Core Functions:

  • count_conversation_tokens() in helpers.py:452-473 - Token counting helper
  • generate_context_summary() in llm_interaction.py:317-380 - LLM-based summarization
  • compact_conversation_history() in rag_memory.py:825-956 - Main compaction logic
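The main compaction step might look something like the sketch below: everything older than the preserve window is replaced by a single summary marker. The function body and the `summarize` callable are assumptions for illustration; only the name `compact_conversation_history` and the preserve-window behavior come from the resolution summary.

```python
def compact_conversation_history(history: list[dict], summarize, preserve_window: int = 20) -> list[dict]:
    """Replace all but the newest `preserve_window` messages with a summary marker.

    `summarize` stands in for the LLM summarization call: any callable
    mapping a list of messages to a summary string (an assumption here).
    """
    if len(history) <= preserve_window:
        return history  # skip condition: nothing old enough to compact
    old, recent = history[:-preserve_window], history[-preserve_window:]
    marker = {
        "role": "system",
        "content": f"[Compacted {len(old)} messages] {summarize(old)}",
        "compacted": True,  # metadata preserved in the compaction marker
    }
    return [marker] + recent
```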

Integration Points:

  • Sleep phase integration in rag_memory.py:1111-1120 - Runs after memory consolidation
  • Emergency compaction check in assistant_script.py:646-667 - Triggers during awake mode when approaching limits

Key Features:

  • Two-phase triggers: sleep (ideal path) + emergency (fallback at 80%)
  • LLM-generated summaries preserve facts before trimming
  • Journal synthesis entries archive compacted context
  • Configurable model for summarization (defaults to main LLM)
  • Metadata preserved in compaction markers

Testing

  • 14 unit tests in test_context_compaction.py
  • Tests cover: skip conditions, execution, journal entries, summary generation, emergency thresholds
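A skip-condition test from that suite might look like the sketch below. The entry point `compact_if_needed` and its signature are hypothetical stand-ins; only the skip conditions being tested (compaction disabled, usage below the 80% emergency threshold) come from the summary above.

```python
import unittest

def compact_if_needed(history: list[dict], used_tokens: int, max_tokens: int,
                      enabled: bool = True) -> tuple[list[dict], bool]:
    # Minimal stand-in for the real entry point: skip when disabled
    # or when usage is below the 80% emergency threshold.
    if not enabled or used_tokens < max_tokens * 0.80:
        return history, False
    return history[-20:], True  # keep only the preserve window

class TestSkipConditions(unittest.TestCase):
    def test_skips_when_disabled(self):
        history = [{"content": "hi"}] * 30
        out, ran = compact_if_needed(history, 900, 1000, enabled=False)
        self.assertFalse(ran)
        self.assertEqual(out, history)

    def test_runs_at_emergency_threshold(self):
        history = [{"content": "hi"}] * 30
        out, ran = compact_if_needed(history, 800, 1000)
        self.assertTrue(ran)
        self.assertEqual(len(out), 20)
```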

Documentation

  • Wiki page: [Context and Memory Flow Analysis](https://forge.wordpainter.net/blightbow/evennia_ai/wiki/Context-and-Memory-Flow-Analysis)
  • Milestone: "Context Compaction System" contains the implementation plan
Reference
blightbow/evennia_ai#10