Sensitive Content Replacement, Unmasking, Brokering & Rehydration

Send prompts to any LLM
without sending your secrets.

Keep secrets out of LLM traffic—without losing context. SCRUBR is a single-binary forward proxy that masks secrets, PII, and sensitive data on the way out to an LLM provider — and rehydrates the response on the way back, including mid-stream. The provider sees placeholders; your users get the real thing.

your appSCRUBRLLM API original → ⟦S:EMAIL·3⟧ → rehydrated

The wedge is compliance, not routing.

LLM features ship by sending user text to a third party. That text contains API keys, customer PII, internal hostnames, tokens in stack traces — things you're contractually obligated not to share. Destructive redaction (***) wrecks answer quality; trusting the provider isn't an option under SOC 2 / PCI-DSS / HIPAA / GDPR.

SCRUBR gives you a lossless, reversible de-identification round trip instead.

Why SCRUBR & the trade-offs

Everything you need, on the wire

A payload-owning proxy — not an SDK, not a model change.

Reversible masking

Secrets become typed placeholders ⟦S:TYPE·id⟧ — never ***. The model reasons over a stable token; the real value never leaves SCRUBR.

Streaming-correct

Lossless rehydration at every byte boundary, reassembling a sentinel even when the provider splits it across SSE token events.

Provider-aware

Mask only the content paths you name (messages[].content) — never model or metadata, so the API contract stays intact.

Detection that scales

Glossary + a single-pass regex meta-engine + entropy + heuristic NER, fed by .env / file / Vault secret sources. Cost ~flat in rule count.

Sessions & tenancy

Per-request or per-session pseudonyms (memory or Redis, encrypted at rest), with multi-tenant policy and isolated namespaces.

HTTP proxy or reverse proxy

Drop in by changing a base URL, or run it as your OS HTTPS proxy via on-the-fly per-host certificates.

Provable auditing

Tamper-evident hash-chained audit log (counts/types, never values) plus an optional full transaction log of the masked exchange.

Small & fast

One static binary — rustls + ring, no OpenSSL — multi-arch container, hot-reloaded config.

How it works

Five steps, every request. The secret never reaches the provider.

  1. 1
    Detect

    Scan the configured request paths for secrets & PII.

  2. 2
    Mask

    Replace each hit with a reversible sentinel; keep the original only inside SCRUBR.

  3. 3
    Forward

    Send the masked request to the real provider.

  4. 4
    Rehydrate

    Splice originals back into the response stream as it flows.

  5. 5
    Wipe

    Zeroize the per-request map when the response ends.

scrubr.yaml
# scrubr.yaml
routes:
  - { listen_path: "/openai", upstream: "https://api.openai.com", profile: openai }
profiles:
  openai:
    scan_paths:   ["messages[].content"]
    stream_paths: ["choices[].delta.content"]
rules:
  - { name: email, type: EMAIL, pattern: '[\w.+-]+@[\w.-]+\.\w+', priority: 50 }
shell
$ scrubr --config scrubr.yaml --listen 127.0.0.1:8080

Point your app at http://127.0.0.1:8080/openai. Start in dry-run, then enforce.

Use it your way

Reverse proxy

Change a base URL. No CA, no SDK changes. The simplest, safest start.

Getting Started

OS HTTP proxy

Set SCRUBR as your HTTPS_PROXY; it MITMs configured hosts with minted certs and tunnels the rest untouched.

HTTP Proxy guide

Container

Multi-arch image on GHCR, published every release.

Deployment

Read the docs, run the binary.

Everything is in the open — configuration reference, threat model, and the full design.