Architecture: Safety System
Infrastructure - Emergency Stop and Error Recovery
Overview
The safety system provides automatic halting when the assistant encounters sustained failures:
- Emergency stop - Halts all operations until manually cleared
- Consecutive error tracking - Triggers stop after threshold exceeded
- Recovery mechanisms - Manual clear via command, API, or history reset
1. Emergency Stop Mechanism
Trigger Conditions
Emergency stop activates when consecutive errors reach the threshold (emergency.py, assistant_script.py):
# Default configuration (assistant_script.py:190-192)
self.db.emergency_stop = False
self.db.max_consecutive_errors = 5
self.db.consecutive_errors = 0
Trigger Flow
Tick Execution
    │
    ▼
Tool/LLM call fails
    │
    ▼
consecutive_errors += 1
    │
    ▼
Check: consecutive_errors >= max_consecutive_errors?
    │
    ├── NO: Continue tick loop
    │
    └── YES: trigger_emergency_stop()
              │
              ├── Set db.emergency_stop = True
              ├── Log error to Evennia logger
              ├── Notify character (if exists)
              └── Record to event sourcing
Tick Loop Integration
On each tick, emergency stop is checked first (assistant_script.py:587):
if self.db.emergency_stop:
    return  # Skip entire tick
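A minimal sketch of this gate, using a hypothetical stand-in class rather than the real script. SketchScript, tick() and ticks_run are illustrative names that do not come from assistant_script.py:

```python
from types import SimpleNamespace


class SketchScript:
    """Hypothetical stand-in for the assistant script; not the real class."""

    def __init__(self):
        # Only the flag that matters for the gate is modelled here.
        self.db = SimpleNamespace(emergency_stop=False)
        self.ticks_run = 0  # illustrative counter, not in the source

    def tick(self):
        # The emergency stop check comes first, so a stopped assistant
        # performs no work at all on this tick.
        if self.db.emergency_stop:
            return False
        self.ticks_run += 1
        return True


script = SketchScript()
script.tick()                     # runs normally
script.db.emergency_stop = True
script.tick()                     # skipped
print(script.ticks_run)
```

Because the check sits at the top of the tick, no tool or LLM call is attempted while the flag is set.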
2. Core Functions
trigger_emergency_stop()
from evennia.contrib.base_systems.ai.emergency import trigger_emergency_stop
trigger_emergency_stop(script, reason)
Actions performed:
- Sets script.db.emergency_stop = True
- Logs via logger.log_err()
- Sends message to character (if attached)
- Records event to eventsourcing
clear_emergency_stop()
from evennia.contrib.base_systems.ai.emergency import clear_emergency_stop
success, message = clear_emergency_stop(script)
Actions performed:
- Sets script.db.emergency_stop = False
- Resets script.db.consecutive_errors = 0
- Clears script.db.is_ticking = False (prevents stuck state)
- Records clearance event to eventsourcing
Returns:
- (True, "Emergency stop cleared...") on success
- (False, "Emergency stop is not active") if not stopped
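A sketch of the return contract, assuming the function does nothing when no stop is active. The success message is elided in the source, so the sketch keeps the ellipsis. The event recording step is left out.

```python
from types import SimpleNamespace


def clear_emergency_stop_sketch(script):
    """Illustrative only; mirrors the documented (success, message) result."""
    if not script.db.emergency_stop:
        return False, "Emergency stop is not active"
    script.db.emergency_stop = False
    script.db.consecutive_errors = 0
    # Per the source, resetting is_ticking prevents a stuck state.
    script.db.is_ticking = False
    return True, "Emergency stop cleared..."


script = SimpleNamespace(
    db=SimpleNamespace(emergency_stop=True, consecutive_errors=5, is_ticking=True)
)
first = clear_emergency_stop_sketch(script)   # clears the stop
second = clear_emergency_stop_sketch(script)  # nothing left to clear
```

Callers can branch on the first element of the tuple and show the second to the user.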
3. Error Tracking Integration
Success Resets Counter
A successful tool execution resets the counter to zero; a failed one increments it (assistant_script.py:759, 805):
if tool_result.get("success"):
    self.db.consecutive_errors = 0
else:
    self.db.consecutive_errors += 1
Threshold Check
After incrementing, the counter is compared against the threshold (assistant_script.py:810-812):
if self.db.consecutive_errors >= self.db.max_consecutive_errors:
    self.trigger_emergency_stop(
        f"Maximum consecutive errors reached ({self.db.consecutive_errors})"
    )
Critical Exception Handling
Unhandled exceptions also increment the counter (assistant_script.py:820-823):
except Exception:
    self.db.consecutive_errors += 1
    if self.db.consecutive_errors >= self.db.max_consecutive_errors:
        self.trigger_emergency_stop(
            f"Critical error in tick loop after {self.db.consecutive_errors} failures"
        )
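Taken together, the three fragments in this section behave roughly like the sketch below. The CounterSketch class, the run_tool callable, and the handle_tick() and _record_failure() methods are hypothetical. Only the counter logic follows the fragments above, and the reason strings are shortened.

```python
from types import SimpleNamespace


class CounterSketch:
    """Hypothetical stand-in showing the counter logic only."""

    def __init__(self, max_errors=5):
        self.db = SimpleNamespace(
            emergency_stop=False,
            max_consecutive_errors=max_errors,
            consecutive_errors=0,
        )
        self.stop_reason = None

    def trigger_emergency_stop(self, reason):
        self.db.emergency_stop = True
        self.stop_reason = reason

    def _record_failure(self, label):
        self.db.consecutive_errors += 1
        if self.db.consecutive_errors >= self.db.max_consecutive_errors:
            self.trigger_emergency_stop(f"{label} ({self.db.consecutive_errors})")

    def handle_tick(self, run_tool):
        if self.db.emergency_stop:
            return  # stopped: skip the tick
        try:
            tool_result = run_tool()
        except Exception:
            # Unhandled exceptions count as failures too.
            self._record_failure("Critical error in tick loop")
            return
        if tool_result.get("success"):
            self.db.consecutive_errors = 0  # any success resets the streak
        else:
            self._record_failure("Maximum consecutive errors reached")


def failing_tool():
    return {"success": False}


def ok_tool():
    return {"success": True}


script = CounterSketch()
for _ in range(4):
    script.handle_tick(failing_tool)
script.handle_tick(ok_tool)        # streak broken, counter back to 0
for _ in range(5):
    script.handle_tick(failing_tool)
print(script.db.emergency_stop, script.stop_reason)
```

Because a single success resets the counter, only an unbroken run of failures trips the stop.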
4. Sub-Agent Safety
Delegation checks the delegate's emergency stop state and returns an error result if it is set (sub_agents.py:330-335):
if delegate_script.db.emergency_stop:
    defer.returnValue({
        "success": False,
        "error": "emergency_stop",
    })
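The guard can be illustrated without Twisted as a plain function. This is a simplified sketch: delegate_task_sketch and the shape of the success result are assumptions, and only the error result is taken from the fragment above.

```python
from types import SimpleNamespace


def delegate_task_sketch(delegate_script, task):
    """Hypothetical plain-Python version of the delegation guard."""
    if delegate_script.db.emergency_stop:
        # Same error result as the fragment above; the task is never run.
        return {"success": False, "error": "emergency_stop"}
    # Success shape is an assumption made for this sketch.
    return {"success": True, "result": task()}


stopped = SimpleNamespace(db=SimpleNamespace(emergency_stop=True))
healthy = SimpleNamespace(db=SimpleNamespace(emergency_stop=False))

print(delegate_task_sketch(stopped, lambda: "done"))
print(delegate_task_sketch(healthy, lambda: "done"))
```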
5. Recovery
Via Command
assistant emergency clear
This calls clear_emergency_stop() and allows the assistant to resume on the next tick.
Via API
POST /api/ai/assistants/{key}/emergency/clear/
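As a sketch, the request could be built with the Python standard library as shown below. The host, port and assistant key are placeholders. Authentication and the response format are not documented here, so the sketch only constructs the request and does not send it.

```python
from urllib.parse import quote
from urllib.request import Request


def build_clear_request(base_url, key):
    """Build (but do not send) the POST request for the clear endpoint."""
    path = f"/api/ai/assistants/{quote(key, safe='')}/emergency/clear/"
    return Request(base_url.rstrip("/") + path, data=b"", method="POST")


# Hypothetical host and assistant key, for illustration only.
request = build_clear_request("http://localhost:4001", "my-assistant")
print(request.get_method(), request.full_url)
```

Passing the request to urllib.request.urlopen() would send it, once any credentials the deployment requires have been added as headers.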
Via History Reset
The /history reset command also clears emergency state (commands/history.py:334-335):
script.db.consecutive_errors = 0
script.db.emergency_stop = False
6. State Inspection
Emergency state is included in status queries (assistant_script.py:1582-1584):
{
    "emergency_stop": self.db.emergency_stop or False,
    "consecutive_errors": self.db.consecutive_errors or 0,
    "max_consecutive_errors": self.db.max_consecutive_errors or 5,
}
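The "or" fallbacks give sensible values when an attribute was never set. The sketch below reproduces the expressions against a hypothetical stand-in object, with None modelling an unset attribute:

```python
from types import SimpleNamespace


def emergency_status_sketch(db):
    """Illustrative copy of the fallback expressions shown above."""
    return {
        "emergency_stop": db.emergency_stop or False,
        "consecutive_errors": db.consecutive_errors or 0,
        "max_consecutive_errors": db.max_consecutive_errors or 5,
    }


# None models attributes that were never set (hypothetical state).
unset = SimpleNamespace(
    emergency_stop=None, consecutive_errors=None, max_consecutive_errors=None
)
print(emergency_status_sketch(unset))
# prints {'emergency_stop': False, 'consecutive_errors': 0, 'max_consecutive_errors': 5}
```

Since "or" falls back on any falsy value, a max_consecutive_errors of 0 would also be reported as 5.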
Key Files
| File | Lines | Purpose |
|---|---|---|
| emergency.py | 11-43 | trigger_emergency_stop() |
| emergency.py | 45-76 | clear_emergency_stop() |
| assistant_script.py | 190-192 | Default thresholds |
| assistant_script.py | 587 | Tick skip check |
| assistant_script.py | 759-823 | Error tracking and threshold |
| assistant_script.py | 1150-1160 | Script method wrappers |
See also: Architecture-Core-Engine | Architecture-Event-Sourcing