[P1] Implement Conversation History Trimming #10

Closed
opened 2025-12-05 13:49:13 +00:00 by blightbow · 1 comment

Problem

`conversation_history` has no token-based trimming despite the existing `max_history` attribute. This leads to unbounded growth and potential context overflow.

Suggested Fix

  • Implement token-aware cleanup, not just count-based trimming
  • Trim at 80% of `max_context_tokens`
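The suggested fix could be sketched as follows. This is illustrative only: `estimate_tokens` is a stand-in heuristic (a real implementation would use the model's tokenizer), and the function names are assumptions, not the project's actual API.

```python
def estimate_tokens(message: dict) -> int:
    # Rough heuristic: ~4 characters per token. A real implementation
    # would count with the model's own tokenizer instead.
    return max(1, len(message.get("content", "")) // 4)

def trim_history(history: list[dict], max_context_tokens: int) -> list[dict]:
    """Drop the oldest messages until total tokens fit under 80% of the budget."""
    budget = int(max_context_tokens * 0.8)  # trim at 80%, per the suggested fix
    trimmed = list(history)
    total = sum(estimate_tokens(m) for m in trimmed)
    while trimmed and total > budget:
        total -= estimate_tokens(trimmed.pop(0))  # evict oldest first
    return trimmed
```

Trimming to a percentage of the budget (rather than exactly to it) leaves headroom for the next few turns before trimming is needed again.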

Priority

P1 — High Priority

Source

Architecture Audit 2025-12-03, Section 2: Memory Leak Potential


Resolution Summary

Implemented a two-phase context compaction system based on research from Claude Code, MemGPT, and JetBrains NeurIPS 2025 findings.

Implementation Details

Configuration Attributes (assistant_script.py:161-169):

  • compact_enabled - Toggle compaction on/off (default: True)
  • compact_sleep_threshold - Trigger at 70% context usage during sleep
  • compact_emergency_threshold - Emergency trigger at 80% usage
  • compact_preserve_window - Keep last 20 messages intact
  • compact_model - Optional override model for summarization
  • compact_prompt - Optional custom summarization prompt
  • last_compaction - Timestamp tracking

Core Functions:

  • count_conversation_tokens() in helpers.py:452-473 - Token counting helper
  • generate_context_summary() in llm_interaction.py:317-380 - LLM-based summarization
  • compact_conversation_history() in rag_memory.py:825-956 - Main compaction logic
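The main compaction step might look something like the sketch below: everything older than the preserve window is replaced by a single summary marker. The function body and the `summarize` callable are assumptions for illustration; only the name `compact_conversation_history` and the preserve-window behavior come from the resolution summary.

```python
def compact_conversation_history(history: list[dict], summarize, preserve_window: int = 20) -> list[dict]:
    """Replace all but the newest `preserve_window` messages with a summary marker.

    `summarize` stands in for the LLM summarization call: any callable
    mapping a list of messages to a summary string (an assumption here).
    """
    if len(history) <= preserve_window:
        return history  # skip condition: nothing old enough to compact
    old, recent = history[:-preserve_window], history[-preserve_window:]
    marker = {
        "role": "system",
        "content": f"[Compacted {len(old)} messages] {summarize(old)}",
        "compacted": True,  # metadata preserved in the compaction marker
    }
    return [marker] + recent
```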

Integration Points:

  • Sleep phase integration in rag_memory.py:1111-1120 - Runs after memory consolidation
  • Emergency compaction check in assistant_script.py:646-667 - Triggers during awake mode when approaching limits

Key Features:

  • Two-phase triggers: sleep (ideal path) + emergency (fallback at 80%)
  • LLM-generated summaries preserve facts before trimming
  • Journal synthesis entries archive compacted context
  • Configurable model for summarization (defaults to main LLM)
  • Metadata preserved in compaction markers

Testing

  • 14 unit tests in test_context_compaction.py
  • Tests cover: skip conditions, execution, journal entries, summary generation, emergency thresholds
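A skip-condition test from that suite might look like the sketch below. The entry point `compact_if_needed` and its signature are hypothetical stand-ins; only the skip conditions being tested (compaction disabled, usage below the 80% emergency threshold) come from the summary above.

```python
import unittest

def compact_if_needed(history: list[dict], used_tokens: int, max_tokens: int,
                      enabled: bool = True) -> tuple[list[dict], bool]:
    # Minimal stand-in for the real entry point: skip when disabled
    # or when usage is below the 80% emergency threshold.
    if not enabled or used_tokens < max_tokens * 0.80:
        return history, False
    return history[-20:], True  # keep only the preserve window

class TestSkipConditions(unittest.TestCase):
    def test_skips_when_disabled(self):
        history = [{"content": "hi"}] * 30
        out, ran = compact_if_needed(history, 900, 1000, enabled=False)
        self.assertFalse(ran)
        self.assertEqual(out, history)

    def test_runs_at_emergency_threshold(self):
        history = [{"content": "hi"}] * 30
        out, ran = compact_if_needed(history, 800, 1000)
        self.assertTrue(ran)
        self.assertEqual(len(out), 20)
```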

Documentation

  • Wiki page: [Context and Memory Flow Analysis](https://forge.wordpainter.net/blightbow/evennia_ai/wiki/Context-and-Memory-Flow-Analysis)
  • Milestone: "Context Compaction System" contains the implementation plan
Reference
blightbow/evennia_ai#10