Compression only — not a model provider. The install script redirects your tools to Klood; your existing keys still authenticate with Anthropic or OpenAI. Tool outputs, logs, code and files are compressed before the request reaches your provider.
The installer sets base URLs in your shell. Open a new terminal, then run claude, codex, or your agent.
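If an agent does not seem to be going through the proxy, one way to check is to look at the base URLs in the new terminal. The sketch below is only an illustration: the variable names ANTHROPIC_BASE_URL and OPENAI_BASE_URL are assumptions (this page only says the installer sets base URLs), and check_base_urls is a hypothetical helper, not part of Klood.

```python
import os
from urllib.parse import urlparse

# Assumed variable names: the page only says the installer sets "*_BASE_URL".
BASE_URL_VARS = ("ANTHROPIC_BASE_URL", "OPENAI_BASE_URL")
KLOOD_HOST = "api.kloodproject.com"  # host quoted in the Q&A on this page


def check_base_urls(env):
    """Return {variable: bool}, True when that base URL points at the Klood host."""
    status = {}
    for name in BASE_URL_VARS:
        value = env.get(name, "")
        # urlparse("").hostname is None, so unset variables report False.
        status[name] = urlparse(value).hostname == KLOOD_HOST
    return status


if __name__ == "__main__":
    for name, ok in check_base_urls(os.environ).items():
        print(f"{name}: {'routed through Klood' if ok else 'not set to Klood'}")
```

Run it in the same terminal you launch your agent from, since the variables only exist in shells opened after the install.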
Supported agents
One install, all agents. Each tool keeps its own provider key — Klood only changes where requests go first.
Compression pipeline
Klood sits in the middle: compress context, then pass through to Anthropic or OpenAI — whichever protocol your agent used.
Search results, API responses, test output — structured data compacted before the model reads it.
AST-aware parsing keeps signatures and imports, drops noise from Python, JS, Go, Rust and more.
Build logs, stack traces, git diff output — debugging context without eating your window.
Retrieved chunks and pasted files are routed and compressed before the model ever sees them.
Originals stay cached. If the model needs full detail, it retrieves the uncompressed source on demand — nothing is lost, just deferred.
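To make the AST-aware step concrete, here is a rough sketch of what "keep signatures and imports, drop noise" could look like for Python source. It uses only the standard ast module (Python 3.9+ for ast.unparse); the skeleton function is hypothetical and is not Klood's actual implementation.

```python
import ast


def skeleton(source: str) -> str:
    """Keep imports and function/class signatures; replace bodies with '...'."""
    lines = []

    def visit(node, indent=""):
        for child in node.body:
            if isinstance(child, (ast.Import, ast.ImportFrom)):
                lines.append(indent + ast.unparse(child))
            elif isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef)):
                keyword = "async def" if isinstance(child, ast.AsyncFunctionDef) else "def"
                returns = f" -> {ast.unparse(child.returns)}" if child.returns else ""
                args = ast.unparse(child.args)
                lines.append(f"{indent}{keyword} {child.name}({args}){returns}: ...")
            elif isinstance(child, ast.ClassDef):
                bases = ", ".join(ast.unparse(base) for base in child.bases)
                header = f"class {child.name}({bases}):" if bases else f"class {child.name}:"
                lines.append(indent + header)
                before = len(lines)
                visit(child, indent + "    ")
                if len(lines) == before:
                    # Nothing kept from the class body: leave a placeholder.
                    lines.append(indent + "    ...")
            # Everything else (assignments, loops, expressions) is dropped.

    visit(ast.parse(source))
    return "\n".join(lines)
```

Function bodies, module-level statements and comments are discarded, so the model still sees what a file exposes without reading how each function works.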
Pricing
Track compression in your dashboard. No credit card to get started.
$0
Coming soon
Why Klood
One install on macOS, Linux, or Windows routes all your agents to our API. No infra keys.
Use your Anthropic and OpenAI accounts. We compress and forward — we don't resell models.
Managed, redundant infrastructure — scaled for agent traffic. Nothing to run on your laptop.
60–95% token reduction on real agent workloads with minimal accuracy loss on standard evals.
Benchmarks
Published eval results.
Typical agent payloads — tool JSON, shell output, build logs. Measured on Apple M-series CPU.
lm-eval harness · baseline vs compressed
97% accuracy preserved · 19% fewer tokens
Before/After · 97% accuracy preserved · 32% fewer tokens
LLM-as-Judge · 100 production log entries: locate error, code, resolution, and affected count.
Scrapinghub Article Extraction · 181 HTML pages
Questions & answers
A hosted compression proxy — not an LLM API. Your agent sends requests to api.kloodproject.com instead of api.anthropic.com or api.openai.com. We shrink tool outputs, logs, code and files, then forward the request to the real provider using your API key.
Because Klood does not sell tokens or host models. You pay Anthropic for Claude, OpenAI for GPT, etc. — exactly as before. The install script only redirects *_BASE_URL to Klood; your keys are forwarded unchanged to the matching provider after compression.
By API protocol, not by you picking a provider in Klood. Claude Code hits /v1/messages (Anthropic wire) → forwarded to Anthropic. Codex and GPT in Cursor hit /v1/chat/completions (OpenAI wire) → forwarded to OpenAI. The installer sets both base URLs so every tool finds Klood; our server picks the upstream from the request shape.
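The routing rule described above can be sketched as a simple lookup from request path to upstream. This is a simplified illustration that looks only at the path; the real server is described as deciding from the request shape, which may involve more than that. UPSTREAMS and pick_upstream are hypothetical names.

```python
# Paths and provider hosts are the ones named on this page.
UPSTREAMS = {
    "/v1/messages": "https://api.anthropic.com",        # Anthropic wire
    "/v1/chat/completions": "https://api.openai.com",   # OpenAI wire
}


def pick_upstream(path: str) -> str:
    """Return the provider base URL for a request path, ignoring any query string."""
    route = path.split("?", 1)[0].rstrip("/")
    try:
        return UPSTREAMS[route]
    except KeyError:
        raise ValueError(f"unrecognised API path: {path}") from None


if __name__ == "__main__":
    print(pick_upstream("/v1/messages"))
    print(pick_upstream("/v1/chat/completions"))
```

Because the decision comes from the request itself, no provider setting is needed on the Klood side.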
No. End users only run the install command — then your normal Anthropic or OpenAI keys, same as before.
macOS, Linux (Ubuntu, Debian, etc.) and Windows. The installer sets environment variables for your shell or user profile. Cursor needs one paste in Settings.
Klood is measured on GSM8K, TruthfulQA, SQuAD and tool-use tasks with minimal accuracy delta. Reversible compression (CCR) means the model can fetch full originals when it needs them — so you don't lose information, you defer it. See the benchmark table for token numbers.
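The idea behind reversible compression can be illustrated with a small in-memory store: the shortened text carries a reference, and the reference maps back to the full original. This is a toy sketch under stated assumptions, not Klood's CCR implementation; the OriginalStore class, the truncation strategy and the ref= marker format are all hypothetical.

```python
import hashlib


class OriginalStore:
    """Toy reversible compression: truncate text, keep the original retrievable."""

    def __init__(self):
        self._originals = {}

    def compress(self, text: str, keep: int = 80) -> str:
        """Return text unchanged if short, else a truncated form with a lookup key."""
        if len(text) <= keep:
            return text
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()[:12]
        self._originals[key] = text
        omitted = len(text) - keep
        return f"{text[:keep]} [+{omitted} chars omitted, ref={key}]"

    def retrieve(self, key: str) -> str:
        """Return the full original for a key produced by compress()."""
        return self._originals[key]


if __name__ == "__main__":
    store = OriginalStore()
    short = store.compress("line of build output\n" * 50)
    key = short.split("ref=")[1].rstrip("]")
    print(len(short), len(store.retrieve(key)))
```

The model reads the short form by default and asks for the original by reference only when the omitted detail matters.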
Tool return values, JSON blobs, stack traces, build logs, retrieved RAG chunks, pasted files and long diffs — the noisy context agents accumulate. Your prompts and API keys are handled normally; the savings come from slimming what the model has to read.
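As a rough picture of what slimming a JSON tool result can mean, the sketch below drops empty fields, caps long lists with a count marker, and serializes without extra whitespace. It is one possible approach for illustration only; compact, to_compact_json and the max_items default are hypothetical and do not describe Klood's actual algorithm.

```python
import json


def compact(value, max_items=3):
    """Drop empty fields and cap long lists, recursively."""
    if isinstance(value, dict):
        out = {}
        for key, item in value.items():
            slim = compact(item, max_items)
            # Skip None and empty strings/lists/dicts; keep 0 and False.
            if slim not in (None, "", [], {}):
                out[key] = slim
        return out
    if isinstance(value, list):
        head = [compact(item, max_items) for item in value[:max_items]]
        extra = len(value) - max_items
        if extra > 0:
            head.append(f"... {extra} more")
        return head
    return value


def to_compact_json(value, max_items=3):
    """Serialize the compacted value with no whitespace between tokens."""
    return json.dumps(compact(value, max_items), separators=(",", ":"))


if __name__ == "__main__":
    payload = {
        "results": [{"id": n, "note": None} for n in range(1, 6)],
        "error": None,
    }
    print(to_compact_json(payload))
```

Unlike the reversible scheme described above, this sketch on its own discards data, so it would need to be paired with a cache of originals to avoid losing information.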
Your key stays on your machine; it is only sent to our API over HTTPS with each request, the same way it would be sent to Anthropic or OpenAI directly. We don't train models on your data.
One install command. Open a new terminal. Use Claude, Cursor, or Codex with compressed context.