Admin

Administrative endpoints for system-level AI configuration and maintenance.

POST /api/_ai-refresh-cache

Forces a refresh of the AI configuration cache.

Authentication: permission.ai.manage

Request Body: (none)

Response:

{
  "success": true,
  "cacheRefreshed": true,
  "timestamp": "2024-01-15T14:45:00Z",
  "componentsRefreshed": [
    "models",
    "providers",
    "plans",
    "entitlements"
  ]
}

Use Cases:

  • After manual database changes
  • Troubleshooting cache inconsistencies
  • Verifying configuration updates
  • Testing new model deployments

Side Effects:

  • Clears all AI-related cache entries
  • Forces reload from database
  • May cause brief performance impact during reload
  • All users get fresh configuration on next request

Cache Management

The AI configuration cache is automatically refreshed every 15 minutes. Manual refresh is typically only needed for debugging or immediate configuration changes.
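
As a rough illustration of calling the refresh endpoint from a script, the sketch below builds the POST request and checks the response shape shown above. The `base_url` and `token` parameters, the helper names, and the bearer-token header are assumptions made for the example; the source only states that the `permission.ai.manage` permission is required, so substitute whatever authentication scheme your deployment uses.

```python
import json
import urllib.request
from typing import List


def build_refresh_request(base_url: str, token: str) -> urllib.request.Request:
    """Build (without sending) the POST request for a manual cache refresh."""
    url = base_url.rstrip("/") + "/api/_ai-refresh-cache"
    return urllib.request.Request(
        url,
        method="POST",  # the endpoint takes no request body
        headers={
            # Assumption: bearer-token auth; substitute your deployment's scheme.
            "Authorization": f"Bearer {token}",
            "Accept": "application/json",
        },
    )


def refreshed_components(response_body: str) -> List[str]:
    """Return componentsRefreshed, raising if the refresh did not succeed."""
    payload = json.loads(response_body)
    if not (payload.get("success") and payload.get("cacheRefreshed")):
        raise RuntimeError("cache refresh did not succeed")
    return list(payload.get("componentsRefreshed", []))
```

To send the request, pass the result of `build_refresh_request` to `urllib.request.urlopen` and hand the decoded body to `refreshed_components`.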


Cache Architecture

Cached Components

The AI system caches several components for performance:

  1. Models - Model definitions, capabilities, pricing
  2. Providers - Provider configurations (credentials excluded)
  3. Plans - AI plan definitions and seat allocations
  4. Entitlements - Tenant entitlements from License Manager
  5. Team Settings - Team-specific AI configurations

Cache Invalidation

Caches are automatically invalidated when:

  • A model is created, updated, or deleted
  • A provider is created, updated, or deleted
  • A plan is modified
  • A seat is assigned or unassigned
  • Team settings are changed
  • A manual refresh is requested

Cache TTL

Default time-to-live for cache entries:

  • Models: 15 minutes
  • Providers: 15 minutes
  • Plans: 5 minutes
  • Entitlements: 15 minutes
  • Team Settings: 10 minutes
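
To make the TTL values concrete, here is a minimal sketch of an expiry check driven by the defaults listed above. The dictionary keys and the `is_expired` helper are illustrative names, not part of the product, and treating an entry as expired exactly when its age reaches the TTL is an assumption.

```python
from datetime import datetime, timedelta, timezone

# Default TTLs from the list above; the key names are illustrative.
DEFAULT_TTL = {
    "models": timedelta(minutes=15),
    "providers": timedelta(minutes=15),
    "plans": timedelta(minutes=5),
    "entitlements": timedelta(minutes=15),
    "team_settings": timedelta(minutes=10),
}


def is_expired(component: str, cached_at: datetime, now: datetime) -> bool:
    """Return True once the entry's age reaches the component's TTL."""
    return now - cached_at >= DEFAULT_TTL[component]


cached_at = datetime(2024, 1, 15, 14, 30, tzinfo=timezone.utc)
six_minutes_later = cached_at + timedelta(minutes=6)
plans_expired = is_expired("plans", cached_at, six_minutes_later)    # past the 5-minute TTL
models_expired = is_expired("models", cached_at, six_minutes_later)  # within the 15-minute TTL
```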

Cache Keys

Cache keys follow this pattern:

ai:tenant:{tenantId}:models
ai:tenant:{tenantId}:providers
ai:tenant:{tenantId}:plans
ai:tenant:{tenantId}:entitlements
ai:tenant:{tenantId}:team:{teamId}:settings
ai:tenant:{tenantId}:seat:{username}
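
When inspecting the cache directly, it can help to generate keys from the patterns above rather than typing them by hand. The helper names below are hypothetical; only the key formats come from the patterns listed.

```python
TENANT_COMPONENTS = ("models", "providers", "plans", "entitlements")


def tenant_key(tenant_id: str, component: str) -> str:
    """Key for a tenant-wide component, e.g. ai:tenant:{tenantId}:models."""
    if component not in TENANT_COMPONENTS:
        raise ValueError(f"unknown component: {component}")
    return f"ai:tenant:{tenant_id}:{component}"


def team_settings_key(tenant_id: str, team_id: str) -> str:
    """Key for one team's AI settings."""
    return f"ai:tenant:{tenant_id}:team:{team_id}:settings"


def seat_key(tenant_id: str, username: str) -> str:
    """Key for one user's seat."""
    return f"ai:tenant:{tenant_id}:seat:{username}"
```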

Configuration Validation

Model Validation

When creating or updating models, the system validates:

  • Provider exists and is accessible
  • Pricing is non-negative
  • Context window is positive
  • Max output tokens ≤ context window
  • Tier is valid (everyday, advanced, strategic)
  • Alias models reference valid canonical models
  • No circular references in alias chains
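
The checks above can be sketched as follows. This is not the system's actual validator: the field names (`pricing`, `contextWindow`, `maxOutputTokens`, `tier`) and the `alias_of` mapping are assumptions chosen for illustration, and the provider check is omitted because it needs a live lookup.

```python
from typing import Dict, List, Optional

VALID_TIERS = {"everyday", "advanced", "strategic"}


def resolve_alias(model_id: str, alias_of: Dict[str, Optional[str]]) -> str:
    """Follow alias links to the canonical model.

    alias_of maps each model id to the id it aliases, or None if canonical.
    Raises ValueError on an unknown model or a circular chain.
    """
    seen = set()
    current = model_id
    while True:
        if current not in alias_of:
            raise ValueError(f"unknown model: {current}")
        if current in seen:
            raise ValueError(f"circular alias chain at: {current}")
        seen.add(current)
        target = alias_of[current]
        if target is None:
            return current
        current = target


def validate_model(model: dict) -> List[str]:
    """Return a list of validation errors (empty if the model passes)."""
    errors = []
    if any(price < 0 for price in model["pricing"].values()):
        errors.append("pricing must be non-negative")
    if model["contextWindow"] <= 0:
        errors.append("context window must be positive")
    if model["maxOutputTokens"] > model["contextWindow"]:
        errors.append("max output tokens must not exceed context window")
    if model["tier"] not in VALID_TIERS:
        errors.append("tier must be everyday, advanced, or strategic")
    return errors
```

Tracking visited ids in a set keeps the cycle check linear in the length of the alias chain.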

Provider Validation

When creating or updating providers:

  • Type is supported (see /api/provider-types)
  • Credentials are valid for provider type
  • Enterprise credentials exist for provider
  • No duplicate providers for same type

Plan Validation

When syncing plans from License Manager:

  • Weekly budget ≥ session budget
  • Both budgets are positive
  • Plan tier matches available model tiers
  • Billing cycle is valid (monthly, annual)
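
A minimal sketch of these plan rules, assuming hypothetical parameter names and that the set of available model tiers is passed in by the caller:

```python
from typing import List, Set

VALID_BILLING_CYCLES = {"monthly", "annual"}


def validate_plan(
    weekly_budget: float,
    session_budget: float,
    plan_tier: str,
    billing_cycle: str,
    available_tiers: Set[str],
) -> List[str]:
    """Return a list of validation errors (empty if the plan passes)."""
    errors = []
    if weekly_budget <= 0 or session_budget <= 0:
        errors.append("both budgets must be positive")
    if weekly_budget < session_budget:
        errors.append("weekly budget must be at least the session budget")
    if plan_tier not in available_tiers:
        errors.append("plan tier must match an available model tier")
    if billing_cycle not in VALID_BILLING_CYCLES:
        errors.append("billing cycle must be monthly or annual")
    return errors
```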

Seat Validation

When assigning or modifying seats:

  • User exists and is enabled
  • Plan exists and has capacity
  • User doesn't already have a seat (unless upgrading from free)
  • Reassignment rules are enforced for purchasable seats
  • Team budget caps are not exceeded

System Health Checks

AI System Status

Check if AI features are operational:

# BASE_URL is a placeholder for your server's base URL
curl "$BASE_URL/api/_health/ai"

Response:

{
  "status": "healthy",
  "components": {
    "models": {
      "status": "healthy",
      "count": 12,
      "chatCapable": 8
    },
    "providers": {
      "status": "healthy",
      "count": 3,
      "connected": 3
    },
    "seats": {
      "status": "healthy",
      "total": 35,
      "assigned": 28,
      "utilizationPercent": 80
    },
    "cache": {
      "status": "healthy",
      "lastRefresh": "2024-01-15T14:30:00Z"
    }
  },
  "warnings": [],
  "errors": []
}

Status Values:

  • healthy - All systems operational
  • degraded - Some components unavailable
  • unhealthy - Critical failures detected

Component Checks

Individual component health:

  • Models: At least one chat-capable model exists
  • Providers: All providers can authenticate
  • Seats: Seat assignments are consistent
  • Cache: Last cache refresh was less than 30 minutes ago
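
For monitoring scripts, the component checks can be approximated on the client side from the health response. The sketch below is only an interpretation: it assumes that `connected == count` means every provider authenticated, and it skips the seat-consistency check, which cannot be derived from the response fields shown. The function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

MAX_CACHE_AGE = timedelta(minutes=30)


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 timestamp such as 2024-01-15T14:30:00Z."""
    # Older Python versions do not accept a trailing "Z" in fromisoformat.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def check_components(health: dict, now: datetime) -> dict:
    """Re-evaluate the models, providers, and cache checks from a response."""
    components = health["components"]
    providers = components["providers"]
    last_refresh = parse_timestamp(components["cache"]["lastRefresh"])
    return {
        "models": components["models"]["chatCapable"] >= 1,
        "providers": providers["connected"] == providers["count"],
        "cache": now - last_refresh < MAX_CACHE_AGE,
    }


sample = {
    "components": {
        "models": {"count": 12, "chatCapable": 8},
        "providers": {"count": 3, "connected": 3},
        "cache": {"lastRefresh": "2024-01-15T14:30:00Z"},
    }
}
result = check_components(sample, datetime(2024, 1, 15, 14, 45, tzinfo=timezone.utc))
```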

Common Warnings

  • low_seat_availability - Few unassigned seats remaining
  • cache_stale - Cache hasn't refreshed recently
  • budget_depleted - Multiple users at budget limits
  • model_deprecated - Deprecated models still in use

Common Errors

  • provider_unavailable - Provider API unreachable
  • model_misconfigured - Model validation failed
  • license_sync_failed - Cannot sync with License Manager
  • cache_failure - Redis cache unavailable

Troubleshooting Guide

Issue: Users cannot access AI features

Diagnosis:

  1. Check user has active AI seat: GET /api/ai-seats?username={user}
  2. Verify entitlements synced: GET /api/ai-entitlements
  3. Check team settings: GET /api/ai-team-settings/{teamId}
  4. Review budget status: GET /api/ai-budget
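
If these diagnosis steps are scripted, usernames and team IDs should be URL-encoded before being placed in the path or query string. A small sketch, with a hypothetical helper name, that builds the four request paths in order:

```python
from typing import List
from urllib.parse import quote, urlencode


def access_diagnosis_paths(username: str, team_id: str) -> List[str]:
    """Return the GET paths for the four diagnosis steps, in order."""
    return [
        "/api/ai-seats?" + urlencode({"username": username}),
        "/api/ai-entitlements",
        "/api/ai-team-settings/" + quote(team_id, safe=""),
        "/api/ai-budget",
    ]


paths = access_diagnosis_paths("jane doe", "ops/eu")
```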

Resolution:

  • Assign seat if missing
  • Refresh entitlements if stale
  • Grant boost budget if depleted
  • Verify team AI is enabled

Issue: High costs or usage

Diagnosis:

  1. Review usage by plan: GET /api/ai-usage
  2. Check top users: Review topUsers in usage response
  3. Identify expensive models: Review byModel breakdown
  4. Check for boost budgets: These may bypass normal limits

Resolution:

  • Reduce seat budgets for high-usage users
  • Restrict model tier access
  • Remove boost budgets
  • Educate users on cost-effective models

Issue: Model errors or failures

Diagnosis:

  1. Check model configuration: GET /api/models/{id}
  2. Verify provider status: GET /api/providers/{providerId}
  3. Test provider connection: Try POST /api/models/{id}/_chat
  4. Review provider credentials

Resolution:

  • Update provider credentials
  • Switch to alternate model
  • Contact provider support
  • Check provider API status page

Issue: License Manager sync failures

Diagnosis:

  1. Check entitlement refresh: POST /api/ai-entitlements/_refresh
  2. Review sync logs: Server logs for sync errors
  3. Verify License Manager connectivity
  4. Check API credentials

Resolution:

  • Refresh entitlements manually
  • Verify LM API credentials
  • Contact License Manager support
  • Check network connectivity

Maintenance Tasks

Regular Maintenance

Recommended maintenance schedule:

Daily:

  • Review usage dashboard for anomalies
  • Check budget consumption trends
  • Monitor seat utilization

Weekly:

  • Review top users and costs
  • Audit seat assignments
  • Check for stale team settings

Monthly:

  • Analyze cost trends
  • Review model deprecations
  • Update provider credentials if needed
  • Audit user access

Cleanup Tasks

Periodic cleanup to maintain performance:

Stale Chats:

DELETE FROM chats WHERE updated_at < NOW() - INTERVAL '90 days' AND message_count = 0;

Old Memories:

DELETE FROM memories WHERE created_at < NOW() - INTERVAL '1 year';

Expired Boosts:

UPDATE ai_seats SET boost_budget = 0, boost_used = 0, boost_expires_at = NULL
WHERE boost_expires_at < NOW();

Configuration Backup

Backup critical AI configuration:

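# BASE_URL is a placeholder for your server's base URL;
# add whatever authentication your deployment requires.

# Export models
curl "$BASE_URL/api/models" > models-backup.json

# Export providers (credentials excluded)
curl "$BASE_URL/api/providers" > providers-backup.json

# Export plans
curl "$BASE_URL/api/ai-plans" > plans-backup.json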

Performance Optimization

Query Optimization

For high-traffic tenants:

  • Enable query result caching
  • Use pagination for large lists
  • Limit memory search results
  • Batch seat operations

Model Selection

Optimize costs without sacrificing quality:

  • Use everyday tier for simple queries
  • Reserve strategic tier for complex analysis
  • Enable prompt caching for repeated contexts
  • Batch similar requests

Budget Tuning

Balance costs and user experience:

  • Set session budgets to 20-25% of the weekly budget
  • Grant boost sparingly for special projects
  • Monitor and adjust based on actual usage
  • Educate users on cost-effective patterns
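
As a worked example of the 20-25% guideline, the sketch below computes the suggested session budget range from a weekly budget. The function name is hypothetical, and the result is in whatever unit your budgets use.

```python
from typing import Tuple


def session_budget_range(weekly_budget: float) -> Tuple[float, float]:
    """Return (low, high) session budgets at 20% and 25% of the weekly budget."""
    if weekly_budget <= 0:
        raise ValueError("weekly budget must be positive")
    # One fifth is 20% and one quarter is 25% of the weekly budget.
    return (weekly_budget / 5, weekly_budget / 4)


low, high = session_budget_range(100)  # low == 20.0, high == 25.0
```

Any value in this range also satisfies the plan rule that the weekly budget must be at least the session budget.
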

Monitoring

Set up alerts for unusual cost spikes, budget depletion, or seat exhaustion to proactively manage AI resources.