Description
Summary
The new sensitive-data engine on main dropped the bundled provider-key / private-key detector coverage that master had.
master used src/gitleaks_rules.rs plus gitleaks.toml to detect provider-specific credentials and PEM/private-key material before placeholdering. origin/main replaces that with src/sensitive.rs, but the new engine currently only has first-class kinds for OpaqueToken, Password, Email, Phone, NationalId, Passport, PaymentCard, and Cvv, and its generic token path is just high-entropy detection.
That means classic structured-but-not-necessarily-high-entropy secrets from the old runtime are no longer first-class matches in the new codebase.
Evidence
- Old runtime:
  - src/gitleaks_rules.rs
  - gitleaks.toml
  - Examples already covered there include AWS access key IDs and PEM private-key blocks.
- New runtime:
  - src/sensitive.rs
  - detect_opaque_tokens() only wraps high-entropy token detection.
  - No equivalent provider-key / private-key detector set exists in the new engine.
Why this matters
This narrows practical coverage for exactly the kinds of secrets KeyClaw is supposed to intercept in AI prompts:
- AWS access key IDs like AKIA...
- provider-prefixed tokens that are structured rather than purely high-entropy
- PEM / OpenSSH / age private-key material
Those were major strengths of the old project and are worth extracting into the new architecture rather than leaving behind.
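To illustrate the gap, here is a minimal sketch of why entropy alone can miss a structured credential. The shannon_entropy function and the 4.0 bits-per-character cutoff are assumptions made for this example, not the scorer or threshold used in src/sensitive.rs. The key is the example access key ID from AWS's own documentation.

```rust
use std::collections::HashMap;

/// Shannon entropy in bits per character.
/// Illustrative only; not the scorer used by the real engine.
fn shannon_entropy(s: &str) -> f64 {
    let mut counts: HashMap<char, usize> = HashMap::new();
    for c in s.chars() {
        *counts.entry(c).or_insert(0) += 1;
    }
    let len = s.chars().count() as f64;
    if len == 0.0 {
        return 0.0;
    }
    counts
        .values()
        .map(|&n| {
            let p = n as f64 / len;
            -p * p.log2()
        })
        .sum()
}

/// Hypothetical cutoff, chosen only for this illustration.
const ENTROPY_THRESHOLD: f64 = 4.0;

/// True when an entropy-only detector with the cutoff above would flag the token.
fn flagged_by_entropy_alone(token: &str) -> bool {
    shannon_entropy(token) >= ENTROPY_THRESHOLD
}
```

Under this measure a 20-character string cannot exceed log2(20), roughly 4.32 bits per character, so short structured IDs have little headroom against a cutoff tuned for long random tokens. With the assumed cutoff, flagged_by_entropy_alone("AKIAIOSFODNN7EXAMPLE") is false.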
Proposed extraction
Port a curated subset of the old master detector corpus into the new sensitive.rs engine while keeping the new format-preserving placeholder design and session-scoped store.
Minimum bar:
- Restore explicit detection for common cloud/provider credentials
- Restore explicit detection for private-key / PEM material
- Add regression tests showing the new runtime rewrites those cases again
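As a starting point for the first two items, a structural detector could look roughly like the sketch below. It uses only the standard library. StructuredKind, StructuredMatch, and both function names are hypothetical and do not exist in src/sensitive.rs. The old gitleaks.toml rules remain the authoritative source for the exact patterns. The sketch covers AKIA-prefixed access key IDs and PEM-style blocks whose label ends in PRIVATE KEY (which includes OpenSSH keys); age keys are not handled.

```rust
/// Hypothetical match kinds; the real engine's kind enum may differ.
#[derive(Debug, PartialEq)]
enum StructuredKind {
    AwsAccessKeyId,
    PrivateKeyBlock,
}

/// A detected span, as byte offsets into the input text.
#[derive(Debug, PartialEq)]
struct StructuredMatch {
    kind: StructuredKind,
    start: usize,
    end: usize,
}

/// Finds `AKIA` followed by exactly 16 uppercase letters or digits,
/// bounded on both sides by a non-alphanumeric byte or the text edge.
fn detect_aws_access_key_ids(text: &str) -> Vec<StructuredMatch> {
    let bytes = text.as_bytes();
    let mut out = Vec::new();
    let mut i = 0;
    while i + 20 <= bytes.len() {
        let boundary_before = i == 0 || !bytes[i - 1].is_ascii_alphanumeric();
        let boundary_after = i + 20 == bytes.len() || !bytes[i + 20].is_ascii_alphanumeric();
        let body_ok = bytes[i + 4..i + 20]
            .iter()
            .all(|b| b.is_ascii_uppercase() || b.is_ascii_digit());
        if boundary_before && boundary_after && &bytes[i..i + 4] == b"AKIA" && body_ok {
            out.push(StructuredMatch { kind: StructuredKind::AwsAccessKeyId, start: i, end: i + 20 });
            i += 20;
        } else {
            i += 1;
        }
    }
    out
}

/// Finds `-----BEGIN <label>PRIVATE KEY-----` through the matching END line.
/// If the END line is missing, the rest of the text is treated as key
/// material (a conservative assumption made for this sketch).
fn detect_private_key_blocks(text: &str) -> Vec<StructuredMatch> {
    const BEGIN: &str = "-----BEGIN ";
    const LABEL_END: &str = "PRIVATE KEY-----";
    let mut out = Vec::new();
    let mut pos = 0;
    while let Some(rel) = text[pos..].find(BEGIN) {
        let start = pos + rel;
        let after_begin = start + BEGIN.len();
        // The label must close on the same line as the BEGIN marker.
        let line_end = text[after_begin..]
            .find('\n')
            .map_or(text.len(), |n| after_begin + n);
        let header = &text[after_begin..line_end];
        match header.find(LABEL_END) {
            Some(h) => {
                let label = &header[..h + "PRIVATE KEY".len()];
                let footer = format!("-----END {}-----", label);
                let body_start = after_begin + h + LABEL_END.len();
                let end = text[body_start..]
                    .find(footer.as_str())
                    .map_or(text.len(), |f| body_start + f + footer.len());
                out.push(StructuredMatch { kind: StructuredKind::PrivateKeyBlock, start, end });
                pos = end;
            }
            // Not a private-key header (for example a certificate); keep scanning.
            None => pos = after_begin,
        }
    }
    out
}
```

Returning byte spans rather than rewritten text keeps the sketch independent of the placeholder design: the spans could be handed to whatever rewriting step the new engine already uses for its other kinds.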
Suggested regression tests
- AWS access key ID in a chat message is rewritten even when entropy alone would not catch it
- PEM private key block is rewritten
- A provider-specific prefixed token is rewritten without relying only on entropy
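The third test could take roughly the following shape. This is a sketch under stated assumptions: rewrite_prefixed_tokens, the PREFIXED_TOKENS table, and the X-masked placeholder are all hypothetical stand-ins, since the real placeholder would come from the session-scoped store. The ghp_ prefix followed by 36 alphanumerics is the shape of GitHub's classic personal access tokens. The test token is deliberately low-entropy so that only a structural rule can catch it.

```rust
/// Hypothetical table of provider prefixes and the body length that follows.
const PREFIXED_TOKENS: &[(&str, usize)] = &[("ghp_", 36)];

/// Replaces each prefixed token with a same-length placeholder that keeps
/// the prefix and masks the body with 'X'. Illustrative stand-in for the
/// real format-preserving placeholder logic.
fn rewrite_prefixed_tokens(text: &str) -> String {
    let bytes = text.as_bytes();
    let mut out = String::with_capacity(text.len());
    let mut i = 0;
    'scan: while i < bytes.len() {
        for &(prefix, body_len) in PREFIXED_TOKENS {
            let end = i + prefix.len() + body_len;
            let boundary_before = i == 0 || !bytes[i - 1].is_ascii_alphanumeric();
            if boundary_before
                && end <= bytes.len()
                && bytes[i..].starts_with(prefix.as_bytes())
                && bytes[i + prefix.len()..end].iter().all(|b| b.is_ascii_alphanumeric())
                && (end == bytes.len() || !bytes[end].is_ascii_alphanumeric())
            {
                out.push_str(prefix);
                out.extend(std::iter::repeat('X').take(body_len));
                i = end;
                continue 'scan;
            }
        }
        // Copy one full UTF-8 character so non-ASCII text survives intact.
        let ch = text[i..].chars().next().unwrap();
        out.push(ch);
        i += ch.len_utf8();
    }
    out
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn prefixed_token_is_rewritten_without_relying_on_entropy() {
        // 36 identical characters: correctly structured, but near-zero entropy.
        let token = format!("ghp_{}", "a".repeat(36));
        let input = format!("use {} please", token);
        let output = rewrite_prefixed_tokens(&input);
        assert!(!output.contains(token.as_str()));
        assert_eq!(output.len(), input.len());
    }
}
```

The same test shape would apply to the AWS access key ID and PEM cases, swapping in the engine's real rewrite entry point.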