1 Architecture Safety System
blightbow edited this page 2025-12-08 04:50:35 +00:00

Architecture: Safety System

Infrastructure - Emergency Stop and Error Recovery


Overview

The safety system provides automatic halting when the assistant encounters sustained failures:

  • Emergency stop - Halts all operations until manually cleared
  • Consecutive error tracking - Triggers a stop once the consecutive error count reaches the threshold
  • Recovery mechanisms - Manual clear via command, API, or history reset

1. Emergency Stop Mechanism

Trigger Conditions

Emergency stop activates when the consecutive error count reaches max_consecutive_errors (emergency.py, assistant_script.py). The defaults are:

# Default configuration (assistant_script.py:190-192)
self.db.emergency_stop = False
self.db.max_consecutive_errors = 5
self.db.consecutive_errors = 0

Trigger Flow

Tick Execution
    │
    ▼
Tool/LLM call fails
    │
    ▼
consecutive_errors += 1
    │
    ▼
Check: consecutive_errors >= max_consecutive_errors?
    │
    ├── NO: Continue tick loop
    │
    └── YES: trigger_emergency_stop()
              │
              ├── Set db.emergency_stop = True
              ├── Log error to Evennia logger
              ├── Notify character (if exists)
              └── Record to event sourcing

Tick Loop Integration

On each tick, emergency stop is checked first (assistant_script.py:587):

if self.db.emergency_stop:
    return  # Skip entire tick
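
The gating and trigger flow above can be sketched with a small stand-in class. This is an illustration only, not the real script: SketchScript, its tick() signature, and ticks_run are hypothetical names, and the db attributes are modeled with a plain SimpleNamespace rather than Evennia's attribute handler.

```python
from types import SimpleNamespace


class SketchScript:
    """Stand-in for the assistant script; not the real Evennia class."""

    def __init__(self, max_consecutive_errors=5):
        # Mirrors the default attributes listed under Trigger Conditions.
        self.db = SimpleNamespace(
            emergency_stop=False,
            max_consecutive_errors=max_consecutive_errors,
            consecutive_errors=0,
        )
        self.ticks_run = 0  # hypothetical counter, for demonstration only

    def tick(self, call):
        """Run one tick; `call` is a hypothetical zero-argument tool/LLM call."""
        if self.db.emergency_stop:
            return  # Skip the entire tick while stopped
        self.ticks_run += 1
        try:
            call()
            self.db.consecutive_errors = 0
        except Exception:
            self.db.consecutive_errors += 1
            if self.db.consecutive_errors >= self.db.max_consecutive_errors:
                self.db.emergency_stop = True


def always_fails():
    raise RuntimeError("simulated tool failure")


script = SketchScript(max_consecutive_errors=3)
for _ in range(10):
    script.tick(always_fails)

print(script.db.emergency_stop, script.ticks_run)  # True 3
```

With a threshold of 3, the third failing tick sets the flag and the remaining seven ticks return immediately without doing any work.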

2. Core Functions

trigger_emergency_stop()

from evennia.contrib.base_systems.ai.emergency import trigger_emergency_stop

trigger_emergency_stop(script, reason)

Actions performed:

  1. Sets script.db.emergency_stop = True
  2. Logs via logger.log_err()
  3. Sends message to character (if attached)
  4. Records event to eventsourcing
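
As a rough sketch of those four actions, assuming simple stand-ins: the standard library logging module replaces Evennia's logger, FakeCharacter is a hypothetical object that collects messages, and a plain list stands in for the event sourcing store. The real function in emergency.py may differ in details such as message wording and event shape.

```python
import logging
from types import SimpleNamespace

logger = logging.getLogger("safety_sketch")  # stand-in for Evennia's logger


class FakeCharacter:
    """Hypothetical character that just collects messages."""

    def __init__(self):
        self.messages = []

    def msg(self, text):
        self.messages.append(text)


def trigger_emergency_stop_sketch(script, reason, events):
    """Illustrative version of the four actions; `events` is a plain list."""
    script.db.emergency_stop = True                       # 1. set the flag
    logger.error("Emergency stop triggered: %s", reason)  # 2. log the error
    character = getattr(script, "character", None)
    if character is not None:                             # 3. notify if attached
        character.msg(f"Emergency stop: {reason}")
    events.append({"type": "emergency_stop", "reason": reason})  # 4. record


events = []
script = SimpleNamespace(
    db=SimpleNamespace(emergency_stop=False),
    character=FakeCharacter(),
)
trigger_emergency_stop_sketch(script, "Maximum consecutive errors reached (5)", events)
```

The character notification is guarded so that a script without an attached character still stops cleanly.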

clear_emergency_stop()

from evennia.contrib.base_systems.ai.emergency import clear_emergency_stop

success, message = clear_emergency_stop(script)

Actions performed:

  1. Sets script.db.emergency_stop = False
  2. Resets script.db.consecutive_errors = 0
  3. Resets script.db.is_ticking = False (prevents a stuck tick state)
  4. Records clearance event to eventsourcing

Returns:

  • (True, "Emergency stop cleared...") on success
  • (False, "Emergency stop is not active") if not stopped
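
The clear path and its two return shapes might look roughly like the following. This is a sketch under assumptions: the events list is a hypothetical stand-in for event sourcing, and the success message is a placeholder because the full text is elided on this page.

```python
from types import SimpleNamespace


def clear_emergency_stop_sketch(script, events):
    """Illustrative clear; returns a (success, message) tuple."""
    if not script.db.emergency_stop:
        return False, "Emergency stop is not active"
    script.db.emergency_stop = False
    script.db.consecutive_errors = 0
    script.db.is_ticking = False  # avoid a stuck "tick in progress" state
    events.append({"type": "emergency_stop_cleared"})
    # Placeholder text: the real success message is longer than shown here.
    return True, "Emergency stop cleared"


events = []
script = SimpleNamespace(
    db=SimpleNamespace(emergency_stop=True, consecutive_errors=5, is_ticking=True)
)
first = clear_emergency_stop_sketch(script, events)
second = clear_emergency_stop_sketch(script, events)
print(first[0], second)
```

Calling it a second time is harmless: the function reports that no stop is active and records no additional event.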

3. Error Tracking Integration

Success Resets Counter

A successful tool result resets the counter; a failed one increments it (assistant_script.py:759, 805):

if tool_result.get("success"):
    self.db.consecutive_errors = 0
else:
    self.db.consecutive_errors += 1

Threshold Check

After each increment, the counter is compared against the threshold (assistant_script.py:810-812):

if self.db.consecutive_errors >= self.db.max_consecutive_errors:
    self.trigger_emergency_stop(
        f"Maximum consecutive errors reached ({self.db.consecutive_errors})"
    )
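
Because any success resets the counter, only an unbroken run of failures can trigger the stop. The helper below is a hypothetical, self-contained illustration of that rule; first_stop_index is not part of the codebase.

```python
def first_stop_index(results, max_consecutive_errors=5):
    """Return the index of the result that would trigger the stop, or None.

    `results` is a sequence of booleans, True meaning the call succeeded.
    """
    consecutive = 0
    for index, success in enumerate(results):
        if success:
            consecutive = 0  # any success resets the run
        else:
            consecutive += 1
            if consecutive >= max_consecutive_errors:
                return index
    return None


# Eight failures in total, but never five in a row: no stop.
print(first_stop_index([False] * 4 + [True] + [False] * 4))  # None

# Five failures in a row: the fifth one (index 4) triggers the stop.
print(first_stop_index([False] * 5))  # 4
```

A flaky tool that fails often but occasionally succeeds will therefore never trip the stop, which is worth keeping in mind when choosing max_consecutive_errors.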

Critical Exception Handling

Exceptions caught at the tick-loop level also increment the counter and run the same threshold check (assistant_script.py:820-823):

except Exception:
    self.db.consecutive_errors += 1
    if self.db.consecutive_errors >= self.db.max_consecutive_errors:
        self.trigger_emergency_stop(
            f"Critical error in tick loop after {self.db.consecutive_errors} failures"
        )

4. Sub-Agent Safety

Before delegating, the delegate's emergency stop state is checked, and the delegation is refused with an error result if the delegate is stopped (sub_agents.py:330-335):

if delegate_script.db.emergency_stop:
    defer.returnValue({
        "success": False,
        "error": "emergency_stop",
    })
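
The same guard can be sketched without Twisted, using a plain return in place of defer.returnValue. The function name, the task callable, and the shape of the success result are assumptions made for illustration; only the refusal dictionary comes from the snippet above.

```python
from types import SimpleNamespace


def delegate_sketch(delegate_script, task):
    """`task` is a hypothetical zero-argument callable for the delegated work."""
    if delegate_script.db.emergency_stop:
        # Refuse without running the task at all.
        return {"success": False, "error": "emergency_stop"}
    return {"success": True, "result": task()}  # assumed success shape


healthy = SimpleNamespace(db=SimpleNamespace(emergency_stop=False))
halted = SimpleNamespace(db=SimpleNamespace(emergency_stop=True))

ok = delegate_sketch(healthy, lambda: "done")
refused = delegate_sketch(halted, lambda: "done")
print(ok["success"], refused["error"])  # True emergency_stop
```

Returning a structured error instead of raising lets the delegating assistant treat the refusal like any other failed tool result.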

5. Recovery

Via Command

assistant emergency clear

This calls clear_emergency_stop() and allows the assistant to resume on the next tick.

Via API

POST /api/ai/assistants/{key}/emergency/clear/
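
A client call to this endpoint could be assembled as below, using only the standard library. The base URL and assistant key are hypothetical, and authentication is not covered on this page, so the sketch only builds the request and does not send it.

```python
from urllib.parse import quote
from urllib.request import Request


def build_clear_request(base_url, assistant_key):
    """Build (but do not send) the POST request for the clear endpoint."""
    key = quote(assistant_key, safe="")  # escape the key for use in a path
    path = f"/api/ai/assistants/{key}/emergency/clear/"
    return Request(base_url.rstrip("/") + path, data=b"", method="POST")


# Hypothetical host and key, for illustration only.
request = build_clear_request("http://localhost:4001", "my-assistant")
print(request.get_method(), request.full_url)
```

Passing the resulting object to urllib.request.urlopen() would send it; any required credentials or headers would need to be added first.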

Via History Reset

The /history reset command also clears emergency state (commands/history.py:334-335):

script.db.consecutive_errors = 0
script.db.emergency_stop = False

6. State Inspection

Emergency state is included in status queries (assistant_script.py:1582-1584):

{
    "emergency_stop": self.db.emergency_stop or False,
    "consecutive_errors": self.db.consecutive_errors or 0,
    "max_consecutive_errors": self.db.max_consecutive_errors or 5,
}
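
The `or` fallbacks make the status safe to read even when an attribute has never been set, assuming unset attributes read as None. A minimal sketch with a SimpleNamespace standing in for self.db (emergency_status is a hypothetical helper name):

```python
from types import SimpleNamespace


def emergency_status(db):
    """Build the emergency portion of a status query from a db-like object."""
    return {
        "emergency_stop": db.emergency_stop or False,
        "consecutive_errors": db.consecutive_errors or 0,
        "max_consecutive_errors": db.max_consecutive_errors or 5,
    }


unset = SimpleNamespace(
    emergency_stop=None, consecutive_errors=None, max_consecutive_errors=None
)
print(emergency_status(unset))
# {'emergency_stop': False, 'consecutive_errors': 0, 'max_consecutive_errors': 5}
```

One side effect of this pattern is that a stored threshold of 0 is also reported as 5, since 0 is falsy in Python.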

Key Files

File                 Lines      Purpose
emergency.py         11-43      trigger_emergency_stop()
emergency.py         45-76      clear_emergency_stop()
assistant_script.py  190-192    Default thresholds
assistant_script.py  587        Tick skip check
assistant_script.py  759-823    Error tracking and threshold
assistant_script.py  1150-1160  Script method wrappers

See also: Architecture-Core-Engine | Architecture-Event-Sourcing