Cloud Mode (Claude API)
KULVEX supports hybrid inference: local Mnemo for speed and privacy, Claude API for complex reasoning.
How It Works
The user toggles Cloud mode in the chat UI. When enabled:
- Messages are sent to the Anthropic API instead of the local llama-server
- Claude has access to the same 17 domain agents via native tool_use
- Responses stream back via Socket.IO identically to local mode
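The toggle described above can be sketched as a simple routing decision. This is an illustrative sketch, not KULVEX's actual code: the function and constant names are hypothetical, and the llama-server address assumes its common default port.

```python
# Hypothetical sketch of the Cloud-mode routing decision.
# LOCAL_URL, CLOUD_URL, and pick_backend are illustrative names,
# not KULVEX internals; the local port is an assumption.
LOCAL_URL = "http://127.0.0.1:8080/v1/chat/completions"  # local llama-server
CLOUD_URL = "https://api.anthropic.com/v1/messages"      # Anthropic Messages API

def pick_backend(cloud_mode: bool) -> str:
    """Return the inference endpoint for the current toggle state."""
    return CLOUD_URL if cloud_mode else LOCAL_URL
```

Either way, the response is streamed back to the browser over the same Socket.IO channel, so the UI does not need to know which backend answered.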
Setup
- Get an API key from console.anthropic.com
- Go to Settings in the KULVEX dashboard
- Enter your ANTHROPIC_API_KEY

Or set it in ~/.kulvex/.env:

ANTHROPIC_API_KEY=sk-ant-xxx

When to Use Cloud
- Complex reasoning — Multi-step analysis, code review, long-form writing
- Tool-heavy tasks — Claude's native tool_use is more reliable than local models' tool calling
- No GPU — Cloud-only mode when running without a GPU
- Comparison — Test local vs cloud on the same query
Privacy
When Cloud mode is off (default), zero data leaves your machine. When Cloud mode is on, your messages are sent to Anthropic’s API. Anthropic does not use API inputs for training.
Cost
Claude API is usage-based. KULVEX shows token counts in the chat UI so you can monitor costs. For most users, local Mnemo handles 90%+ of daily queries at zero marginal cost.
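As a rough sketch of how the token counts shown in the chat UI translate into spend: the rates below are placeholders, not Anthropic's actual pricing (check their pricing page for current per-model numbers), and the function name is hypothetical.

```python
# Placeholder per-million-token rates in USD -- illustrative only,
# NOT Anthropic's actual pricing. Substitute current published rates.
INPUT_RATE_PER_MTOK = 3.00
OUTPUT_RATE_PER_MTOK = 15.00

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate USD cost of one API exchange from its token counts."""
    return (input_tokens / 1_000_000 * INPUT_RATE_PER_MTOK
            + output_tokens / 1_000_000 * OUTPUT_RATE_PER_MTOK)
```

Multiplying your typical daily token counts through a formula like this makes it easy to see when a query is worth routing to the cloud versus keeping on local Mnemo.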