First 100M tokens saved — free

Compress context before your agents read it

Compression only — not a model provider. The install script redirects your tools to Klood; your existing keys still authenticate with Anthropic or OpenAI. Tool outputs, logs, code and files are compressed before the request reaches your provider.

$ curl -fsSL https://kloodproject.com/install.sh | bash

The installer sets base URLs in your shell. Open a new terminal, then run claude, codex, or your agent.
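To confirm that a new terminal picked up the redirect, a small check like the one below can help. It is a minimal sketch, not part of the installer: the variable names ANTHROPIC_BASE_URL and OPENAI_BASE_URL are assumptions (this page only says "*_BASE_URL"), and the host api.kloodproject.com is taken from the FAQ on this page.

```python
import os

# Assumed names: the page only says "*_BASE_URL". These two are the variables
# commonly read by Anthropic and OpenAI clients; Klood has not confirmed them here.
BASE_URL_VARS = ("ANTHROPIC_BASE_URL", "OPENAI_BASE_URL")
KLOOD_HOST = "api.kloodproject.com"  # host named in the FAQ on this page


def base_url_status(env=None):
    """Return {variable: bool} saying whether each base URL points at Klood."""
    env = os.environ if env is None else env
    return {name: KLOOD_HOST in env.get(name, "") for name in BASE_URL_VARS}


if __name__ == "__main__":
    for name, routed in base_url_status().items():
        state = "routed through Klood" if routed else "not set to Klood"
        print(f"{name}: {state}")
```

If a variable shows as not set, opening a fresh terminal is the first thing to try, since the installer writes to your shell profile.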

Supported agents

All your tools. One place.

One install, all agents. Each tool keeps its own provider key — Klood only changes where requests go first.

Claude
Cursor
OpenAI Codex
Copilot CLI
Ollama
Aider

Compression pipeline

What gets compressed

Klood sits in the middle: compress context, then pass through to Anthropic or OpenAI — whichever protocol your agent used.

Your agent → Klood → Anthropic · OpenAI

SmartCrusher

Tool & JSON outputs

Search results, API responses, test output — structured data compacted before the model reads it.

CodeCompressor

Source code

AST-aware parsing keeps signatures and imports, drops noise from Python, JS, Go, Rust and more.
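As an illustration of what "keeps signatures and imports" can mean, here is a minimal sketch using Python's standard ast module. It is not CodeCompressor's actual algorithm, which this page does not describe; the function name skeleton is hypothetical, and it handles Python only.

```python
import ast


def skeleton(source: str) -> str:
    """Keep imports and signatures; replace every function body with `...`."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # Drop the implementation, keep the name, arguments and decorators.
            node.body = [ast.Expr(value=ast.Constant(value=...))]
    return ast.unparse(tree)  # requires Python 3.9+


sample = '''import os

def load(path, limit=10):
    with open(path) as fh:
        return fh.read()[:limit]
'''

if __name__ == "__main__":
    print(skeleton(sample))
```

Working on the syntax tree rather than on raw text means the result is still valid Python, so a model can read the structure of a file without the bodies.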

Log crush

Logs & diffs

Build logs, stack traces, git diff output — debugging context without eating your window.

RAG route

RAG & files

Retrieved chunks and pasted files are routed and compressed before the model ever sees them.

CCR

Reversible compression

Originals stay cached. If the model needs full detail, it retrieves the uncompressed source on demand — nothing is lost, just deferred.
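The idea of deferring rather than discarding detail can be sketched as a cache keyed by a content hash. This is an assumption-laden illustration, not Klood's CCR implementation: the class ReversibleStore, its methods, and the "keep the first lines" strategy are all hypothetical.

```python
import hashlib


class ReversibleStore:
    """Hypothetical sketch: shorten text, but keep the original retrievable."""

    def __init__(self):
        self._originals = {}

    def compress(self, text, keep_lines=2):
        """Return (compact_text, key). The original is cached under key."""
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()[:12]
        self._originals[key] = text
        lines = text.splitlines()
        if len(lines) <= keep_lines:
            return text, key  # nothing worth trimming
        omitted = len(lines) - keep_lines
        marker = f"[{omitted} more lines cached under {key}]"
        return "\n".join(lines[:keep_lines] + [marker]), key

    def retrieve(self, key):
        """Return the full, uncompressed original for a key."""
        return self._originals[key]


if __name__ == "__main__":
    store = ReversibleStore()
    log = "\n".join(f"step {i} ok" for i in range(50))
    compact, key = store.compress(log)
    print(compact)
    assert store.retrieve(key) == log
```

The marker line tells the reader of the compact text that more exists and which key retrieves it, which is the part that makes the compression reversible.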

Pricing

Start free. Scale when you need to.

Track compression in your dashboard. No credit card to get started.

Free tier

100M tokens saved

All supported agents
Compression analytics dashboard
Personal API key for usage tracking
Your provider keys stay local

After free tier

Pay as you save

Coming soon

Same compression pipeline
Usage-based billing on tokens saved
Team dashboards (planned)

Why Klood

Built for daily agent work

✓ One curl install

macOS, Linux, Windows — routes all your agents to our API. No infra keys.

✓ Your keys stay yours

Use your Anthropic and OpenAI accounts. We compress and forward — we don't resell models.

✓ Always-on hosted API

Managed, redundant infrastructure — scaled for agent traffic. Nothing to run on your laptop.

✓ Benchmark-backed

60–95% token reduction on real agent workloads with minimal accuracy loss on standard evals.

Benchmarks

Measured token savings

Published eval results in four categories: Compression, Accuracy, Tool JSON, and HTML.

Typical agent payloads — tool JSON, shell output, build logs. Measured on Apple M-series CPU.

Suite tokens: 23,921→8,110

Tokens saved: 15,811

Overall reduction: 66.1%
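The overall figure can be reproduced from the per-payload token counts reported in this section. The short check below only restates the published numbers; the dictionary labels and the helper reduction_pct are illustrative names.

```python
# (tokens before, tokens after) for each payload in the compression suite
ROWS = {
    "Tool JSON, 100 log entries": (3163, 297),
    "Tool JSON, 500 log entries": (9526, 1614),
    "Shell output, 200 lines": (3238, 469),
    "Build log, 200 lines": (2412, 148),
    "grep results, 150 hits": (2624, 2624),
    "Python source, ~480 lines": (2958, 2958),
}


def reduction_pct(before, after):
    """Percentage of tokens removed, rounded to one decimal place."""
    return round(100 * (before - after) / before, 1)


total_before = sum(before for before, _ in ROWS.values())
total_after = sum(after for _, after in ROWS.values())

if __name__ == "__main__":
    print(total_before, total_after, total_before - total_after)
    print(reduction_pct(total_before, total_after))
```

Because the two payloads that pass through unchanged still count toward the suite total, the overall reduction is lower than the per-payload figures for logs and tool JSON.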

Payload · Input · Method · Reduction · Tokens · Time

Tool JSON · 100 log entries · SmartCrusher · 90.6% · 3,163→297 · 1ms
Tool JSON · 500 log entries · SmartCrusher · 83.1% · 9,526→1,614 · 2ms
Shell output · 200 lines · log crush · 85.5% · 3,238→469 · 1ms
Build log · 200 lines · log crush · 93.9% · 2,412→148 · 1ms
grep results · 150 hits · already minimal · 0% · 2,624→2,624 · <1ms
Python source · ~480 lines · AST preserved · 0% · 2,958→2,958 · <1ms

gpt-4o-mini · N = 100 · Eval tier 1

Standard accuracy

lm-eval harness · baseline vs compressed

GSM8K · Math

Baseline 0.870 · Compressed 0.870 · Δ 0.000

TruthfulQA · Factual

Baseline 0.530 · Compressed 0.560 · Δ +0.030

SQuAD v2 · QA · with compression

97% accuracy preserved · 19% fewer tokens

BFCL · Tools · before/after

97% accuracy preserved · 32% fewer tokens

LLM-as-Judge

100 production log entries — locate error, code, resolution, and affected count.

Input tokens: 10,144→1,260

Correct answers: 4/4→4/4

Token reduction: 87.6%

Scrapinghub Article Extraction · 181 HTML pages

F1 score: 0.958→0.919

Recall: 100%→98.2%

Token reduction: 94.9%

Questions & answers

Common questions

Klood is a hosted compression proxy, not an LLM API. Your agent sends requests to api.kloodproject.com instead of api.anthropic.com or api.openai.com. We shrink tool outputs, logs, code and files, then forward the request to the real provider using your API key.

You keep your own provider keys because Klood does not sell tokens or host models. You pay Anthropic for Claude, OpenAI for GPT, and so on, exactly as before. The install script only redirects *_BASE_URL to Klood; your keys are forwarded unchanged to the matching provider after compression.

Klood picks the upstream by API protocol; you do not choose a provider in Klood. Claude Code hits /v1/messages (Anthropic wire) → forwarded to Anthropic. Codex and GPT in Cursor hit /v1/chat/completions (OpenAI wire) → forwarded to OpenAI. The installer sets both base URLs so every tool finds Klood; our server picks the upstream from the request shape.
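Routing by protocol can be pictured as a lookup from request path to upstream. The sketch below is a simplification under stated assumptions: it inspects only the path, whereas the page says the server looks at the "request shape", and the names UPSTREAMS and pick_upstream are hypothetical. The two paths and provider hosts are the ones named on this page.

```python
# Path prefix used by each wire protocol, mapped to the provider it belongs to.
UPSTREAMS = {
    "/v1/messages": "https://api.anthropic.com",         # Anthropic wire
    "/v1/chat/completions": "https://api.openai.com",    # OpenAI wire
}


def pick_upstream(path: str) -> str:
    """Return the provider base URL for a request path, or raise ValueError."""
    route = path.split("?", 1)[0].rstrip("/")  # ignore query string and trailing slash
    try:
        return UPSTREAMS[route]
    except KeyError:
        raise ValueError(f"unrecognised API path: {path}") from None


if __name__ == "__main__":
    print(pick_upstream("/v1/messages"))
    print(pick_upstream("/v1/chat/completions"))
```

Keying on the protocol rather than on a per-user setting is what lets one install serve tools that speak different APIs.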

No. End users only run the install command, then use their normal Anthropic or OpenAI keys, same as before.

Klood supports macOS, Linux (Ubuntu, Debian, etc.) and Windows. The installer sets environment variables for your shell or user profile. Cursor needs one paste in Settings.

Klood is measured on GSM8K, TruthfulQA, SQuAD and tool-use tasks with minimal accuracy delta. Reversible compression (CCR) means the model can fetch full originals when it needs them — so you don't lose information, you defer it. See the benchmark table for token numbers.

Klood compresses tool return values, JSON blobs, stack traces, build logs, retrieved RAG chunks, pasted files and long diffs: the noisy context agents accumulate. Your prompts and API keys are handled normally; the savings come from slimming what the model has to read.

Your key stays on your machine and is sent to our API over HTTPS with each request — same as calling Anthropic or OpenAI directly. We don't train models on your data.

Route your agents through Klood

One install command. Open a new terminal. Use Claude, Cursor, or Codex with compressed context.
