Getting Started

SCRUBR is a single binary. You give it a config that says which upstreams to proxy, which JSON paths to scan, and what to detect — then point your app at it.

1. Install

Grab a binary from the releases (Linux/macOS/Windows, x86_64 + arm64), or run the container:

docker run --rm -p 8080:8080 -v "$PWD/scrubr.yaml:/etc/scrubr/scrubr.yaml:ro" \
  ghcr.io/scrubr-dev/scrubr:latest --config /etc/scrubr/scrubr.yaml --listen 0.0.0.0:8080

Or build from source: cargo build --release -p scrubr.

2. Write a config

# scrubr.yaml
routes:
  - { listen_path: "/openai", upstream: "https://api.openai.com", profile: openai }
profiles:
  openai:
    scan_paths:   ["messages[].content"]
    stream_paths: ["choices[].delta.content"]   # required for streaming responses
rules:
  - { name: email, type: EMAIL, pattern: '[\w.+-]+@[\w.-]+\.\w+', priority: 50 }
masking:
  mode: dry-run     # detect & report first; switch to enforce when you trust coverage

3. Run it

scrubr --config scrubr.yaml --listen 127.0.0.1:8080

Point your app's base URL at http://127.0.0.1:8080/openai instead of https://api.openai.com. That's it — requests are scanned, the upstream sees masked content, and the streamed response comes back rehydrated.

4. Validate, then enforce

Run in dry-run first: SCRUBR detects and reports (via the x-scrubr-detected response header and the audit log) but forwards the original, so you can confirm coverage without risk. When you're happy, set masking.mode: enforce.

scrubr demo            # offline: mask → streamed echo → rehydrate, no network

The mental model

sequenceDiagram
    participant App as Your app
    participant S as SCRUBR
    participant P as LLM provider
    App->>S: request (contains secrets / PII)
    Note over S: detect → mask ⟦S:TYPE·id·tag⟧
    S->>P: masked request
    P-->>S: streamed response (placeholders)
    Note over S: rehydrate originals
    S-->>App: rehydrated response
  • Detect secrets/PII on the configured request paths.
  • Mask each hit with a reversible sentinel; the real value is kept only inside SCRUBR.
  • Forward the masked request upstream.
  • Rehydrate the response stream, splicing originals back — correct even when a sentinel is split across SSE token events.
  • Wipe the per-request map when the response ends.

Where to next