Blightbow edited this page 2025-12-09 04:34:35 -05:00

Troubleshooting Guide

Common problems and solutions for the AI Assistant system.


Quick Diagnostics

Before diving into specific issues, run these commands to gather diagnostic information:

> aisetup mybot              # Check overall status
> aisetup/config mybot       # Verify configuration
> aihistory/execution mybot  # View recent tool executions

1. Assistant Not Responding

Symptom

You send messages to the assistant channel but get no response.

Diagnostic Checklist

1. Is the assistant running?

> aisetup mybot

Look for enabled: True and emergency_stop: False.

2. Is the LLM configured?

> aisetup/config mybot

Verify:

  • llm_provider is set (e.g., openrouter, openai, anthropic)
  • llm_auth_token is set (will show as [REDACTED])
  • llm_model is set and valid

3. Check the tick rate

The assistant only processes messages on tick intervals. With a 5-second tick rate, you might wait up to 5 seconds for a response.

Solutions

Problem           | Solution
Not running       | aisetup/start mybot
Emergency stopped | aisetup/reset mybot, then aisetup/start mybot
No LLM config     | See Getting Started Step 4
Wrong model name  | Check provider docs for valid model names

2. Emergency Stop Triggered

Symptom

The assistant stops responding and status shows emergency_stop: True.

What Triggers It

Emergency stop activates after 5 consecutive failures (configurable via max_consecutive_errors). Failures include:

  • LLM API errors (authentication, rate limits, server errors)
  • Tool execution failures
  • Unhandled exceptions in the tick loop
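As a rough illustration of the counting logic described above, here is a minimal Python sketch. The class name `ErrorGuard` and its methods are hypothetical and not taken from the project. It also assumes that any successful tick resets the streak, which the word "consecutive" suggests but this guide does not state outright.

```python
from dataclasses import dataclass


@dataclass
class ErrorGuard:
    """Hypothetical tracker for consecutive failures (not the project's class)."""

    max_consecutive_errors: int = 5  # default threshold stated in this guide
    consecutive_errors: int = 0
    emergency_stop: bool = False

    def record_success(self) -> None:
        # Assumption: a successful tick breaks the failure streak.
        self.consecutive_errors = 0

    def record_failure(self) -> None:
        # Count the failure and trip the stop once the threshold is reached.
        self.consecutive_errors += 1
        if self.consecutive_errors >= self.max_consecutive_errors:
            self.emergency_stop = True

    def reset(self) -> None:
        # Loosely mirrors what aisetup/reset is described as doing.
        self.consecutive_errors = 0
        self.emergency_stop = False
```

Under these assumptions, four failures followed by one success do not trip the stop; only an unbroken run of `max_consecutive_errors` failures does.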

How to Recover

Step 1: Clear the emergency stop

> aisetup/reset mybot

Step 2: Check recent errors

> aihistory/execution mybot=10

Look for patterns in the failures.

Step 3: Restart

> aisetup/start mybot

Preventing Recurrence

Cause                    | Prevention
Rate limits              | Increase tick rate (aisetup/config mybot set tick_rate 10)
Invalid API key          | Verify key is correct and has sufficient credits
Model not found          | Check model name matches provider format
Persistent tool failures | Review tool permissions and game state

Adjusting the Threshold

If the assistant triggers emergency stop too easily:

> aisetup/config mybot set max_consecutive_errors 10

3. LLM Provider Errors

Authentication Failures (401, 403)

Symptoms:

  • "Unauthorized" or "Forbidden" errors in execution log
  • No responses from assistant

Solutions:

Provider   | Check
OpenRouter | Verify key starts with sk-or-v1- and has credits
OpenAI     | Verify key starts with sk- and has credits
Anthropic  | Verify key format and API access enabled

Reset the token:

> aisetup/config mybot set llm_auth_token sk-NEW_KEY_HERE

Rate Limiting (429)

Symptoms:

  • Intermittent responses
  • "Rate limit exceeded" in logs
  • Works briefly then stops

Solutions:

  1. Increase tick_rate so ticks are further apart (fewer requests per minute):

    > aisetup/config mybot set tick_rate 10
    
  2. Use a different model (some have higher limits):

    > aisetup/config mybot set llm_model openai/gpt-4o-mini
    
  3. Upgrade API tier (provider-specific)
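To see why a larger tick_rate value helps with rate limits, a back-of-the-envelope calculation is enough. The sketch below assumes the assistant makes at most one LLM request per tick; that is an assumption, since a tick that involves tool calls may issue more. The function name is illustrative only.

```python
def max_requests_per_minute(tick_rate_seconds: float, requests_per_tick: int = 1) -> float:
    """Upper bound on LLM requests per minute for a given tick interval."""
    if tick_rate_seconds <= 0:
        raise ValueError("tick_rate_seconds must be positive")
    # 60 seconds divided by the interval gives ticks per minute.
    return 60.0 / tick_rate_seconds * requests_per_tick


print(max_requests_per_minute(5))   # 12.0
print(max_requests_per_minute(10))  # 6.0
```

Doubling the interval from 5 to 10 seconds halves the worst-case request rate, which is often enough to stay under a provider's per-minute limit.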

Model Not Found (404)

Symptoms:

  • "Model not found" errors
  • No responses

Solutions:

OpenRouter models require provider prefix:

# Wrong
> aisetup/config mybot set llm_model gpt-4o-mini

# Correct
> aisetup/config mybot set llm_model openai/gpt-4o-mini
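If you script your configuration, a small format check can catch a missing prefix before the name reaches the provider. This is a hypothetical helper, not part of the aisetup command. It only checks the provider/model shape, not whether the model actually exists.

```python
def has_provider_prefix(model_name: str) -> bool:
    """Return True if the name looks like 'provider/model'."""
    provider, separator, model = model_name.partition("/")
    # All three parts must be present: a provider, the slash, and a model.
    return bool(provider) and bool(separator) and bool(model)


print(has_provider_prefix("gpt-4o-mini"))         # False
print(has_provider_prefix("openai/gpt-4o-mini"))  # True
```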

Common OpenRouter model names:

Provider  | Model Names
OpenAI    | openai/gpt-4o, openai/gpt-4o-mini
Anthropic | anthropic/claude-3.5-sonnet, anthropic/claude-3-haiku
Google    | google/gemini-1.5-flash, google/gemini-1.5-pro

Server Errors (500, 502, 503, 504)

Symptoms:

  • Intermittent failures
  • "Internal server error" messages

Solutions:

These are usually temporary provider issues:

  1. Wait a few minutes
  2. Check provider status page
  3. The assistant will auto-retry with exponential backoff
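For intuition about what exponential backoff means for wait times, the following sketch computes a delay schedule. The base delay, multiplier, and cap are illustrative assumptions; this guide does not state the values the assistant actually uses.

```python
from typing import List


def backoff_delays(
    retries: int,
    base: float = 1.0,    # assumed first delay in seconds
    factor: float = 2.0,  # assumed multiplier per attempt
    cap: float = 60.0,    # assumed upper bound on any single delay
) -> List[float]:
    """Return the wait in seconds before each retry attempt."""
    return [min(cap, base * factor ** attempt) for attempt in range(retries)]


print(backoff_delays(5))  # [1.0, 2.0, 4.0, 8.0, 16.0]
```

Because the delay grows with each attempt, a provider outage of a few minutes is usually ridden out with a handful of requests rather than one per tick.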

4. Tool Execution Failures

Permission Errors

Symptoms:

  • "Permission denied" in tool results
  • "Lock check failed" messages

Causes:

  • AssistantCharacter lacks required permissions
  • Target object has restrictive locks

Solutions:

Check AssistantCharacter permissions:

> examine mybot  # The character, not the script

The character should have Developer and Admin permissions by default. If not:

> perm mybot = Admin
> perm mybot = Developer

Validation Errors

Symptoms:

  • "Invalid parameter" messages
  • Tool returns success: false with validation details

Causes:

  • LLM generated invalid tool arguments
  • Required parameters missing

Solutions:

These usually resolve on the next tick as the LLM receives the error feedback. If persistent:

  1. Check aihistory/execution mybot for patterns
  2. Consider adjusting the system prompt to provide clearer guidance
  3. Try a more capable model

Timeout Errors

Symptoms:

  • "Tool execution timed out"
  • Slow responses

Default timeout: 30 seconds per tool

Solutions:

Increase the timeout:

> aisetup/config mybot set default_tool_timeout 60

For consistently slow operations, consider:

  • Breaking large tasks into smaller steps
  • Using async tools where available
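To illustrate what a per-tool timeout looks like in general, here is a generic asyncio sketch. It is not the project's implementation: Evennia itself is built on Twisted, and the function names and the result dictionary shape below are assumptions chosen to resemble the success: false results mentioned in this guide.

```python
import asyncio
from typing import Any, Awaitable, Dict


async def run_with_timeout(tool_call: Awaitable[Any], timeout: float = 30.0) -> Dict[str, Any]:
    """Await a tool call, giving up after `timeout` seconds."""
    try:
        result = await asyncio.wait_for(tool_call, timeout=timeout)
        return {"success": True, "result": result}
    except asyncio.TimeoutError:
        # wait_for cancels the underlying call before raising.
        return {"success": False, "error": "Tool execution timed out"}


async def slow_tool(seconds: float) -> str:
    # Hypothetical tool that just sleeps, standing in for real work.
    await asyncio.sleep(seconds)
    return "done"
```

For example, `asyncio.run(run_with_timeout(slow_tool(0.01), timeout=1.0))` completes normally, while a tool that outlasts its timeout comes back as a failed result instead of blocking the tick.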

5. Memory and Context Issues

"Context too long" / Compaction Messages

Symptoms:

  • Assistant mentions "compacting memory"
  • Responses become inconsistent

What's Happening: The conversation history exceeded the token budget. The system automatically:

  1. Summarizes older messages
  2. Preserves recent conversation
  3. Continues with reduced history

This is normal behavior. The assistant manages its own context.
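The three steps can be sketched roughly as follows. This is a simplified model, not the project's code: it counts messages rather than tokens, the `keep_recent` parameter is hypothetical, and the summary is a placeholder string where the real system would presumably call the LLM.

```python
from typing import Callable, List


def compact_history(
    messages: List[str],
    max_history: int,
    keep_recent: int,
    summarize: Callable[[List[str]], str],
) -> List[str]:
    """Replace older messages with one summary once the limit is exceeded."""
    if len(messages) <= max_history:
        return list(messages)  # under the limit: nothing to compact
    split = max(0, len(messages) - keep_recent)
    older, recent = messages[:split], messages[split:]
    # One summary entry stands in for everything before the recent window.
    return [summarize(older)] + recent


def placeholder_summary(older: List[str]) -> str:
    # Stand-in for an LLM summarization call.
    return f"[summary of {len(older)} earlier messages]"
```

With ten messages, `max_history=6`, and `keep_recent=3`, the result is one summary entry followed by the last three messages.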

Adjustments:

Increase history limit (uses more tokens):

> aisetup/config mybot set max_history 100

Manually clear history if it's cluttered:

> aihistory/clear/conversation mybot

Entity Profiles Not Persisting

Symptoms:

  • Assistant doesn't remember information about players/NPCs
  • Relationship states reset

Causes:

  • Entity profiles are stored on character.db.entity_profiles
  • If the AssistantCharacter is deleted/recreated, profiles are lost

Solutions:

Entity profiles persist across server restarts. If lost:

  1. Check the character wasn't recreated: examine mybot
  2. Entity consolidation happens during sleep mode
  3. Enable sleep mode for automatic consolidation

Semantic Memory Not Available

Symptoms:

  • "Memory client not initialized" errors
  • aimemory command fails

Cause: Semantic memory requires the optional mem0ai package.

Solutions:

  1. Install the package:

    pip install mem0ai
    
  2. Enable memory:

    > aisetup/config mybot set memory_enabled true
    
  3. Restart the assistant:

    > aisetup/stop mybot
    > aisetup/start mybot
    

6. Sleep Mode Problems

Assistant Won't Sleep

Symptoms:

  • Sleep schedule configured but assistant never sleeps
  • operating_mode stays awake

Causes:

  • Sleep schedule not enabled
  • Current time outside sleep window
  • Minimum awake ticks not reached

Diagnostic:

> aisetup mybot

Check sleep_schedule configuration.

Solutions:

Enable sleep schedule:

# Set these values via code (or manually on the script object).
# `script` is assumed to be a reference to the assistant's script.
script.db.sleep_schedule = {
    "enabled": True,
    "sleep_start_hour": 2,       # 2:00 AM server time
    "sleep_duration_hours": 4,   # Until 6:00 AM
    "min_awake_ticks": 10,       # Stay awake at least 10 ticks
}
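To check a schedule against the three causes listed above, a sketch like the following can help. It reuses the key names from the configuration, but the functions themselves are hypothetical, and it assumes whole-hour granularity with an exclusive end hour (a 2:00 AM start with 4 hours ends just before 6:00 AM).

```python
def in_sleep_window(hour: int, sleep_start_hour: int, sleep_duration_hours: int) -> bool:
    """True if `hour` (0-23) falls inside the window, including past midnight."""
    # Modulo 24 makes windows such as 22:00 to 02:00 work.
    return (hour - sleep_start_hour) % 24 < sleep_duration_hours


def should_sleep(schedule: dict, hour: int, awake_ticks: int) -> bool:
    """Apply the three conditions: enabled, minimum awake ticks, time window."""
    if not schedule.get("enabled", False):
        return False
    if awake_ticks < schedule.get("min_awake_ticks", 0):
        return False
    return in_sleep_window(
        hour, schedule["sleep_start_hour"], schedule["sleep_duration_hours"]
    )
```

With the example schedule, hour 3 and 12 awake ticks qualifies for sleep, while hour 3 with only 5 awake ticks does not.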

Assistant Won't Wake Up

Symptoms:

  • operating_mode: sleep
  • sleep_phase: compacting for extended period

Cause: During the compacting phase, wake requests are blocked. The assistant must complete memory consolidation first.

Solutions:

Wait for compaction to complete (usually a few ticks), then force wake:

> aisetup/config mybot set operating_mode awake

Consolidation Errors

Symptoms:

  • Sleep mode fails with errors
  • "Memory consolidation failed"

Causes:

  • LLM API errors during consolidation
  • Insufficient token budget

Solutions:

  1. Check LLM configuration
  2. Review execution log for specific errors
  3. Clear and retry:
    > aisetup/config mybot set operating_mode awake
    > aisetup/stop mybot
    > aisetup/start mybot
    

7. RAG / Semantic Search Issues

RAG Not Available

Symptoms:

  • "RAG client not initialized"
  • Journal search returns empty results

Causes:

  • Missing qdrant-client package
  • Qdrant server not running
  • RAG not enabled

Solutions:

  1. Install dependencies:

    pip install "qdrant-client[fastembed]"
    
  2. Start Qdrant:

    docker run -d --name qdrant -p 6333:6333 qdrant/qdrant:latest
    
  3. Enable RAG:

    > aisetup/rag mybot enable
    > aisetup/rag mybot config set qdrant_host localhost
    

Embedding Errors

Symptoms:

  • "Embedding generation failed"
  • RAG queries return nothing

Cause: Embedding provider misconfigured or unavailable.

Solutions:

Check embedding provider (auto-detected from LLM provider):

LLM Provider | Default Embedding
openai       | OpenAI API
anthropic    | FastEmbed (local)
openrouter   | FastEmbed (local)
ollama       | Ollama
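The auto-detection amounts to a lookup with an optional override, which can be sketched as below. The function and dictionary names are hypothetical. Of the value strings, only "fastembed" appears in this guide as a config value; "openai" and "ollama" are assumed identifiers.

```python
from typing import Optional

# Defaults per LLM provider, as listed in this guide.
# Assumption: "openai" and "ollama" are the matching config identifiers.
DEFAULT_EMBEDDING = {
    "openai": "openai",
    "anthropic": "fastembed",
    "openrouter": "fastembed",
    "ollama": "ollama",
}


def resolve_embedding_provider(llm_provider: str, override: Optional[str] = None) -> str:
    """Pick the embedding provider: explicit override first, then the default."""
    if override:
        return override
    try:
        return DEFAULT_EMBEDDING[llm_provider.lower()]
    except KeyError:
        raise ValueError(
            f"No default embedding provider known for {llm_provider!r}"
        ) from None
```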

For explicit control:

> aisetup/rag mybot config set rag_embedding_provider fastembed

FastEmbed runs locally and needs no external API (recommended for testing).


8. Performance Issues

Slow Responses

Possible Causes:

  1. Ticks too frequent (tick_rate value too low to leave room for LLM calls)
  2. Model is slow (e.g., large models, busy providers)
  3. Network latency

Solutions:

Use a faster model:

> aisetup/config mybot set llm_model openai/gpt-4o-mini

Increase the tick_rate value (more time between checks):

> aisetup/config mybot set tick_rate 10

High Token Usage

Symptoms:

  • API costs higher than expected
  • Frequent context compaction

Solutions:

  1. Reduce history:

    > aisetup/config mybot set max_history 30
    
  2. Use cheaper models for simple tasks

  3. Review conversation patterns - shorter exchanges use fewer tokens


9. Getting Help

Information to Gather

When seeking help, gather:

  1. aisetup mybot output
  2. aisetup/config mybot output
  3. Recent errors from aihistory/execution mybot=20
  4. Evennia version (evennia version)
  5. Python version (python3 --version)

Debug Mode

For detailed logging, enable debug in your Evennia settings:

# mygame/server/conf/settings.py
DEBUG = True

Check logs in mygame/server/logs/.

Resetting to Known State

If all else fails, reset the assistant completely:

> aisetup/stop mybot
> aihistory/clear mybot
> aisetup/reset mybot
> aisetup/start mybot

Or delete and recreate:

> aisetup/delete mybot
> aisetup/init mybot
# Reconfigure LLM settings...
> aisetup/start mybot

Error Reference

Error Categories

Category       | Retry? | Examples
AUTH           | No     | 401 Unauthorized, 403 Forbidden
TRANSIENT      | Yes    | 429 Rate Limit, 500/502/503/504 Server Errors, Timeouts
BUSINESS_LOGIC | No     | 400 Bad Request, 404 Not Found, Validation Errors
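The category table maps onto a small classifier. The sketch below is illustrative rather than the project's code: the names are hypothetical, and treating every unlisted status code as BUSINESS_LOGIC (and therefore not retried) is an assumption.

```python
from enum import Enum


class ErrorCategory(Enum):
    AUTH = "AUTH"
    TRANSIENT = "TRANSIENT"
    BUSINESS_LOGIC = "BUSINESS_LOGIC"


def classify_status(status: int) -> ErrorCategory:
    """Map an HTTP status code to one of the three categories."""
    if status in (401, 403):
        return ErrorCategory.AUTH
    if status in (429, 500, 502, 503, 504):
        return ErrorCategory.TRANSIENT
    # 400, 404, and (by assumption) any other code.
    return ErrorCategory.BUSINESS_LOGIC


def should_retry(category: ErrorCategory) -> bool:
    # Only transient errors are marked as retryable in the table.
    return category is ErrorCategory.TRANSIENT
```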

HTTP Status Codes

Code        | Meaning             | Action
401         | Invalid API key     | Check/reset llm_auth_token
403         | Access denied       | Check API permissions/tier
404         | Model not found     | Verify model name format
429         | Rate limited        | Wait, increase tick rate, or upgrade tier
500         | Server error        | Wait and retry (automatic)
502/503/504 | Service unavailable | Wait and retry (automatic)

See also: Architecture-Safety-System | Architecture-Resilience-System


Last updated: 2025-12-09