Getting Started
SCRUBR is a single binary. You give it a config that says which upstreams to proxy, which JSON paths to scan, and what to detect — then point your app at it.
1. Install
Grab a binary from the releases (Linux/macOS/Windows, x86_64 + arm64), or run the container:
docker run --rm -p 8080:8080 -v "$PWD/scrubr.yaml:/etc/scrubr/scrubr.yaml:ro" \
ghcr.io/scrubr-dev/scrubr:latest --config /etc/scrubr/scrubr.yaml --listen 0.0.0.0:8080
Or build from source: cargo build --release -p scrubr.
2. Write a config
# scrubr.yaml
routes:
- { listen_path: "/openai", upstream: "https://api.openai.com", profile: openai }
profiles:
openai:
scan_paths: ["messages[].content"]
stream_paths: ["choices[].delta.content"] # required for streaming responses
rules:
- { name: email, type: EMAIL, pattern: '[\w.+-]+@[\w.-]+\.\w+', priority: 50 }
masking:
mode: dry-run # detect & report first; switch to enforce when you trust coverage
3. Run it
scrubr --config scrubr.yaml --listen 127.0.0.1:8080
Point your app's base URL at http://127.0.0.1:8080/openai instead of
https://api.openai.com. That's it — requests are scanned, the upstream sees masked
content, and the streamed response comes back rehydrated.
4. Validate, then enforce
Run in dry-run first: SCRUBR detects and reports (via the x-scrubr-detected
response header and the audit log) but forwards the original,
so you can confirm coverage without risk. When you're happy, set masking.mode: enforce.
scrubr demo # offline: mask → streamed echo → rehydrate, no network
The mental model
sequenceDiagram
participant App as Your app
participant S as SCRUBR
participant P as LLM provider
App->>S: request (contains secrets / PII)
Note over S: detect → mask ⟦S:TYPE·id·tag⟧
S->>P: masked request
P-->>S: streamed response (placeholders)
Note over S: rehydrate originals
S-->>App: rehydrated response
- Detect secrets/PII on the configured request paths.
- Mask each hit with a reversible sentinel; the real value is kept only inside SCRUBR.
- Forward the masked request upstream.
- Rehydrate the response stream, splicing originals back — correct even when a sentinel is split across SSE token events.
- Wipe the per-request map when the response ends.
Where to next
- Why SCRUBR? — the problem, the benefits, and the trade-offs.
- Detection & masking rules — write rules, use the curated ruleset.
- Use as an HTTP proxy — set SCRUBR as your OS proxy (no base-URL change).
- Configuration reference — every config key.