# Architecture: Task Assessment

*Layer 3 - LLM-Driven Complexity Evaluation*

## Overview
Task assessment is the third layer of the three-layer execution pattern system. It provides runtime complexity evaluation that can only restrict (never expand) what Layers 1 and 2 allow:
Layer 1 (Static Config) ⊇ Layer 2 (Context Config) ⊇ Layer 3 (LLM Assessment)
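A minimal sketch of this restriction-only merge, assuming a simple mode ranking; the names here (`MODE_RANK`, `restrict`) are illustrative, not part of the contrib — the real merge lives in `select_execution_pattern()`:

```python
# Illustrative: each layer may only narrow what the previous layer allows.
MODE_RANK = {"single_action": 0, "react_loop": 1}

def restrict(allowed_mode, allowed_iterations, requested_mode, requested_iterations):
    """Clamp an inner layer's request to what the outer layer allows."""
    # An inner layer may downgrade react_loop to single_action, never the reverse.
    if MODE_RANK[requested_mode] > MODE_RANK[allowed_mode]:
        requested_mode = allowed_mode
    # Iterations may only shrink, never exceed the outer layer's cap.
    return requested_mode, min(allowed_iterations, requested_iterations)

print(restrict("react_loop", 5, "react_loop", 8))     # ('react_loop', 5)
print(restrict("single_action", 1, "react_loop", 3))  # ('single_action', 1)
```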
Features:
- **LLM-driven assessment** - full complexity analysis via prompt
- **Heuristic fallback** - fast path without an LLM call
- **Event classification** - categorizes pending events
- **Context signals** - token pressure, errors, goal urgency
## 1. Assessment Output
Both LLM and heuristic assessment return the same schema:
```python
{
    "complexity_score": int,          # 1-5 scale
    "recommended_mode": str,          # "single_action" | "react_loop" | None
    "recommended_iterations": int,    # max iterations for react_loop
    "recommend_confirm_dangerous": bool,
    "reasoning": str,                 # explanation
    "context_signals": dict,          # runtime signals (heuristic only)
}
```
### Complexity Score Mapping
| Score | Meaning | Default Mode | Iterations |
|---|---|---|---|
| 1-2 | Simple task | single_action | 1 |
| 3-4 | Moderate complexity | react_loop | 3-5 |
| 5 | Complex multi-step | react_loop | max |
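The mapping above can be sketched as a small helper; the function name and the `max_iterations` default are assumptions for illustration, not part of the contrib's API:

```python
def mode_for_score(score, max_iterations=10):
    """Map a 1-5 complexity score to (mode, iterations) per the table."""
    if score <= 2:
        return "single_action", 1
    if score <= 4:
        # moderate complexity: 3-5 iterations, scaling with the score
        return "react_loop", min(score + 1, 5)
    return "react_loop", max_iterations  # complex multi-step: max budget

print(mode_for_score(2))  # ('single_action', 1)
print(mode_for_score(3))  # ('react_loop', 4)
print(mode_for_score(5))  # ('react_loop', 10)
```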
## 2. Event Classification

Events are classified to inform pattern selection (`task_assessment.py`).

### Classification Categories
| Category | Keywords | Event Types | Pattern Impact |
|---|---|---|---|
| `communication` | say, page, whisper, reply | `say`, `page`, `whisper`, `channel` | `single_action` (await response) |
| `building` | spawn, create, build, destroy | `build`, `spawn`, `create` | `react_loop` + `confirm_dangerous` |
| `observation` | look, examine, inspect | `look`, `examine` | `single_action` |
| `movement` | go, move, walk, enter | `move`, `travel` | `single_action` |
| `query` | what, who, where, how, ? | `query`, `help` | `react_loop` (2 iterations) |
| `goal_action` | goal, task, complete | `goal`, `task` | `react_loop` |
| `unknown` | — | — | default handling |
Classification Function
from evennia.contrib.base_systems.ai.task_assessment import classify_event_content
event = {"type": "say", "message": "Hello there!"}
category = classify_event_content(event) # → "communication"
For multiple events:
from evennia.contrib.base_systems.ai.task_assessment import classify_events
events = [event1, event2, event3]
dominant_class, class_counts = classify_events(events)
# dominant_class = "communication"
# class_counts = {"communication": 2, "query": 1}
## 3. Context Signals

Runtime context information used for pattern selection.

```python
from evennia.contrib.base_systems.ai.task_assessment import build_context_signals

signals = build_context_signals(script)
# {
#     "event_class": "communication",
#     "event_class_counts": {"communication": 2},
#     "token_pressure": "medium",
#     "token_usage_pct": 45.2,
#     "recent_errors": 0,
#     "goal_urgency": "high",
# }
```
### Token Pressure Levels

| Level | Usage | Effect |
|---|---|---|
| `low` | < 40% | Normal operation |
| `medium` | 40-60% | No restrictions |
| `high` | 60-80% | Reduced iterations |
| `critical` | > 80% | Forced `single_action` |
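The bucketing implied by the table can be sketched as follows; thresholds mirror the table, but the function name is hypothetical:

```python
def token_pressure(usage_pct):
    """Classify token usage (percent of context window) into a pressure level."""
    if usage_pct < 40:
        return "low"
    if usage_pct < 60:
        return "medium"
    if usage_pct < 80:
        return "high"
    return "critical"

print(token_pressure(45.2))  # 'medium'
print(token_pressure(92.0))  # 'critical'
```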
### Error Recovery
| Consecutive Errors | Effect |
|---|---|
| 2+ | Reduced iterations |
| 3+ | Forced single_action + confirm_dangerous |
### Goal Urgency

| Priority | Effect |
|---|---|
| `critical`, `high` | +1 iteration (if not under pressure) |
| `medium`, `low`, none | No modifier |
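The three adjustment rules above (token pressure, consecutive errors, goal urgency) can be combined into one sketch. The dict shapes follow the assessment schema and `build_context_signals()` output shown earlier, but the adjustment logic itself is a paraphrase of the tables, not the contrib's code:

```python
def apply_signal_adjustments(assessment, signals):
    """Adjust a base assessment using runtime context signals (illustrative)."""
    mode = assessment["recommended_mode"]
    iterations = assessment["recommended_iterations"]
    confirm = assessment["recommend_confirm_dangerous"]

    pressure = signals.get("token_pressure", "low")
    errors = signals.get("recent_errors", 0)
    urgency = signals.get("goal_urgency", "none")

    if pressure == "critical" or errors >= 3:
        mode, iterations = "single_action", 1   # forced single_action
        if errors >= 3:
            confirm = True                      # plus confirm_dangerous
    elif pressure == "high" or errors >= 2:
        iterations = max(1, iterations - 1)     # reduced iterations
    elif urgency in ("critical", "high"):
        iterations += 1                         # +1 when not under pressure

    return dict(assessment, recommended_mode=mode,
                recommended_iterations=iterations,
                recommend_confirm_dangerous=confirm)

base = {"recommended_mode": "react_loop", "recommended_iterations": 4,
        "recommend_confirm_dangerous": False}
adjusted = apply_signal_adjustments(base, {"token_pressure": "critical"})
print(adjusted["recommended_mode"], adjusted["recommended_iterations"])
# single_action 1
```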
## 4. LLM Assessment

Full complexity analysis via an LLM prompt.

### Usage

```python
from twisted.internet.defer import inlineCallbacks

from evennia.contrib.base_systems.ai.task_assessment import assess_task_complexity

@inlineCallbacks
def example():
    # only runs if execution_config.task_assessment_enabled = True
    assessment = yield assess_task_complexity(script)
    if assessment:
        # use with select_execution_pattern()
        pattern = select_execution_pattern(static, context, assessment)
```
### Assessment Prompt
The LLM receives:
- Operating mode, token usage %
- Recent tools used
- Conversation history size
- Pending events (up to 5)
- Current goals (up to 5)
It returns JSON containing a `complexity_score` (1-5) plus mode, iteration, and safety recommendations.
### Enabling LLM Assessment

```python
script.db.execution_config = {
    "task_assessment_enabled": True,  # enable Layer 3
    ...
}
```
## 5. Heuristic Assessment

A fast path that avoids the LLM call entirely.

### Usage

```python
from evennia.contrib.base_systems.ai.task_assessment import (
    get_quick_assessment,
    get_quick_assessment_for_script,
)

# direct call with events/goals
assessment = get_quick_assessment(pending_events, current_goals, context_signals)

# convenience wrapper using the script
assessment = get_quick_assessment_for_script(script)
```
### Heuristic Rules
| Condition | Complexity | Mode | Iterations |
|---|---|---|---|
| Communication event | 1 | single_action | 1 |
| Building event | 4 | react_loop | 3 |
| Observation event | 1 | single_action | 1 |
| Query event | 2 | react_loop | 2 |
| Autonomous (no events/goals) | 1 | single_action | 1 |
| Goal pursuit | 3+ | react_loop | goals+2 |
| Default events | 2+ | varies | events+1 |
Context signal adjustments are applied after base pattern selection.
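The base rules above can be paraphrased as a lookup table plus fallbacks; the real logic lives in `get_quick_assessment()`, so treat the names and edge cases here as illustrative:

```python
BASE_RULES = {
    # event_class: (complexity_score, mode, iterations)
    "communication": (1, "single_action", 1),
    "building": (4, "react_loop", 3),
    "observation": (1, "single_action", 1),
    "query": (2, "react_loop", 2),
}

def base_assessment(event_class, n_events=1, n_goals=0):
    """Pick the base (score, mode, iterations) before signal adjustments."""
    if event_class in BASE_RULES:
        score, mode, iterations = BASE_RULES[event_class]
    elif n_events == 0 and n_goals == 0:
        score, mode, iterations = 1, "single_action", 1           # autonomous idle
    elif event_class == "goal_action" or n_goals:
        score, mode, iterations = 3, "react_loop", n_goals + 2    # goal pursuit
    else:
        score, mode, iterations = 2, "react_loop", n_events + 1   # default events
    return {"complexity_score": score, "recommended_mode": mode,
            "recommended_iterations": iterations}

print(base_assessment("building"))
# {'complexity_score': 4, 'recommended_mode': 'react_loop', 'recommended_iterations': 3}
```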
## 6. Assessment Flow

```
Tick Event
    │
    ▼
task_assessment_enabled?
    │
    ├── YES ──► assess_task_complexity(script)
    │             ├── Build context summary
    │             ├── Format assessment prompt
    │             ├── Call LLM
    │             ├── Parse JSON response
    │             └── Apply complexity heuristics
    │
    └── NO ───► get_quick_assessment_for_script(script)
                  ├── build_context_signals()
                  ├── classify_events()
                  ├── Apply base rules per event_class
                  └── Apply context signal adjustments
    │
    ▼  (both paths return an assessment dict)
select_execution_pattern(static, context, assessment)
    │
    ▼
Execute with selected pattern
```
## 7. Integration with Execution Patterns

The assessment feeds into pattern selection:

```python
from evennia.contrib.base_systems.ai.prompt_contexts import select_execution_pattern

# Layer 1: static config (from execution_config)
# Layer 2: context config (from context type)
# Layer 3: assessment (from LLM or heuristics)
effective_pattern = select_execution_pattern(
    static_config,
    context_config,
    assessment,  # can only restrict, never expand
)
```
## Key Files

| File | Lines | Purpose |
|---|---|---|
| `task_assessment.py` | 44-114 | Event classification |
| `task_assessment.py` | 162-310 | Context signal calculation |
| `task_assessment.py` | 314-427 | Assessment prompt and context building |
| `task_assessment.py` | 429-518 | Response parsing and heuristics |
| `task_assessment.py` | 520-584 | `assess_task_complexity()` main function |
| `task_assessment.py` | 587-726 | `get_quick_assessment()` heuristic path |
See also: Architecture-Context-System | Architecture-Core-Engine | Data-Flow-02-ReAct-Loop