# 🦀 Epic: Rust-Powered PII Filter Plugin - 5-10x Performance Improvement

## Goal

Rewrite the performance-critical components of the **PII Filter Plugin** in Rust using PyO3, achieving **5-10x speedup** while maintaining 100% behavioral compatibility with the existing Python implementation. This will serve as the foundation and proof-of-concept for migrating other compute-intensive plugins to Rust.

## Why Now?

The PII Filter Plugin is the **highest-impact candidate** for Rust optimization:

1. **Performance Bottleneck**: Currently processes 12+ regex patterns on every string, adding ~10ms overhead per request
2. **High Usage**: Runs on ~80% of requests via `tool_pre_invoke` and `tool_post_invoke` hooks
3. **Complex Processing**: Deep recursive scanning of nested JSON/dict/list structures with string manipulation
4. **Scalability**: Latency grows linearly with payload size (1KB → 10ms, 10KB → 100ms, 100KB → 1000ms)
5. **Security-Critical**: Must be fast and reliable for production PII detection
6. **Template for Others**: Success here proves the PyO3 integration pattern for 10+ other plugins

**Expected Impact**: 5-10x speedup on PII filter alone translates to **15-20% overall gateway throughput improvement** since it runs on 80% of requests.
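
The throughput estimate can be sanity-checked with a short calculation. This is a rough sketch, not a measurement: it assumes a CPU-bound worker whose throughput is inversely proportional to mean request time, and a mean request time of 48 ms, which is a hypothetical figure not stated in this epic. The function name `throughput_gain` is likewise illustrative.

```python
def throughput_gain(mean_request_ms: float, filter_ms: float,
                    hit_rate: float, speedup: float) -> float:
    """Fractional throughput gain when the filter becomes `speedup` times faster."""
    # Average filter time saved per request, weighted by how often the filter runs.
    saved_ms = hit_rate * filter_ms * (1.0 - 1.0 / speedup)
    # Throughput is modeled as inversely proportional to mean request time.
    return mean_request_ms / (mean_request_ms - saved_ms) - 1.0


# Assumption: 48 ms mean request time (hypothetical).
# From this epic: ~10 ms filter overhead, filter runs on ~80% of requests.
for speedup in (5, 10):
    gain = throughput_gain(48.0, 10.0, 0.8, speedup)
    print(f"{speedup}x filter speedup -> {gain:.1%} more throughput")
```

Under these assumptions the gain is about 15% at 5x and about 18% at 10x. A faster baseline request makes the gain larger and a slower one makes it smaller, so the estimate should be re-run against measured gateway latency.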

---

## 📖 User Stories

<details>
<summary>US-1: Developer - Transparent Rust Acceleration</summary>

**As a** Developer
**I want** the PII filter to automatically use Rust when available, falling back to Python when not
**So that** I get maximum performance without changing my code

**Acceptance Criteria:**

```gherkin
Given I have installed mcpgateway with Rust extensions
When I enable the PII filter plugin in plugins/config.yaml
Then the plugin should automatically detect and use the Rust implementation
And log: "PII Filter using Rust implementation (5-10x faster)"
And process requests with <2ms PII detection overhead

Given I have installed mcpgateway without Rust extensions (pure Python)
When I enable the PII filter plugin
Then the plugin should gracefully fall back to Python implementation
And log: "PII Filter using Python implementation (Rust not available)"
And warn: "Install mcpgateway[rust] for 5-10x better performance"
And process requests with ~10ms PII detection overhead (existing behavior)

Given I want to force Python implementation for debugging
When I set environment variable: MCPGATEWAY_FORCE_PYTHON_PLUGINS=true
Then the plugin should use Python implementation even if Rust is available
And log: "PII Filter using Python implementation (forced)"
```

**Technical Requirements:**
- Auto-detection of Rust binary at plugin initialization
- Graceful fallback with informative logging
- Environment variable override for debugging
- Zero code changes required for existing users
- Same configuration format (YAML) for both implementations

</details>

<details>
<summary>US-2: Platform Admin - Performance Monitoring</summary>

**As a** Platform Administrator
**I want** to see performance metrics comparing Rust vs Python PII filter
**So that** I can validate the speedup and justify the migration

**Acceptance Criteria:**

```gherkin
Given the PII filter is using Rust implementation
When I check the /metrics endpoint
Then I should see Prometheus metrics:
 - pii_filter_detections_duration_seconds{implementation="rust"} (histogram)
 - pii_filter_masking_duration_seconds{implementation="rust"} (histogram)
 - pii_filter_detections_total{implementation="rust"} (counter)
 - pii_filter_implementation{version="rust|python"} (gauge)

When I compare Rust vs Python metrics over 1000 requests
Then I should observe:
 - Rust P50 latency: <1ms (vs Python: ~5ms)
 - Rust P95 latency: <2ms (vs Python: ~15ms)
 - Rust P99 latency: <3ms (vs Python: ~30ms)
 - 5-10x throughput improvement on PII-heavy workloads

When I enable detailed profiling: MCPGATEWAY_PROFILE_PLUGINS=true
Then I should see detailed timing breakdowns:
 - Pattern compilation: <0.1ms (one-time cost)
 - Regex matching: <0.5ms per 1KB payload
 - Masking operations: <0.3ms per 1KB payload
 - JSON traversal: <0.2ms per 1KB payload
```

**Technical Requirements:**
- Prometheus metrics for both implementations
- Detailed timing instrumentation with tracing
- Performance comparison dashboard
- Memory usage tracking (Rust should use ~same or less memory)

</details>

<details>
<summary>US-3: Security Engineer - Behavioral Compatibility</summary>

**As a** Security Engineer
**I want** the Rust PII filter to produce identical results to the Python version
**So that** I can trust the migration without security regressions

**Acceptance Criteria:**

```gherkin
Given a test corpus of 1000+ PII detection cases covering all pattern types:
 - SSNs (123-45-6789, 123456789)
 - Credit cards (4111-1111-1111-1111, 4111111111111111)
 - Emails (user@example.com, user+tag@subdomain.example.co.uk)
 - Phone numbers (US, international, various formats)
 - IP addresses (IPv4, IPv6)
 - Dates of birth (MM/DD/YYYY, labeled formats)
 - Passports (A123456, AB1234567)
 - Bank accounts (IBAN, generic account numbers)
 - Medical records (MRN: 123456)
 - AWS keys (AKIAIOSFODNN7EXAMPLE)
 - API keys (api_key: abc123...)
 - Nested JSON structures
 - Unicode edge cases
 - Malformed data

When I run differential testing: Python output vs Rust output
Then the outputs must be 100% identical for:
 - Detection locations (start, end positions)
 - Detection types (SSN, credit_card, email, etc.)
 - Detection counts
 - Masked output strings
 - Partial masking formats (e.g., ***-**-1234 for SSNs)

When I run property-based fuzzing with 10,000+ random inputs
Then both implementations must:
 - Return same detection count
 - Apply same masking strategy
 - Handle edge cases identically (empty strings, null bytes, huge payloads)
 - Never crash or panic

When I test with real-world payloads from production logs (sanitized)
Then Rust must detect all PII that Python detects
And Rust must not introduce false positives
And masking output must be byte-for-byte identical
```

**Technical Requirements:**
- Comprehensive differential testing suite
- Property-based testing with `proptest` (Rust) and `hypothesis` (Python)
- Fuzzing with AFL/libFuzzer for edge cases
- Real-world payload regression tests
- CI/CD gates: Rust tests must pass before merge

</details>

<details>
<summary>US-4: Developer - Easy Installation and Distribution</summary>

**As a** Developer
**I want** to install the Rust-accelerated PII filter with a single command
**So that** I don't have to manually compile Rust code

**Acceptance Criteria:**

```gherkin
Given I am on Linux x86_64, macOS x86_64/ARM64, or Windows x86_64
When I run: pip install mcpgateway[rust]
Then it should install pre-compiled wheels with Rust extensions
And the installation should complete in <60 seconds
And no Rust toolchain should be required

Given I am on an unsupported platform (e.g., ARM32, BSD)
When I run: pip install mcpgateway[rust]
Then it should install the pure Python version
And warn: "Pre-compiled Rust binaries not available for this platform"
And provide instructions for building from source

Given I want to build from source
When I run: pip install mcpgateway[rust] --no-binary :all:
Then it should compile Rust extensions using maturin
And require: Rust toolchain installed (rustc, cargo)
And compile in release mode with optimizations

Given I am using Docker
When I use the official mcpgateway Docker image
Then it should include pre-compiled Rust extensions
And use Rust implementation by default
And be based on Python 3.11+ slim image
```

**Technical Requirements:**
- Pre-compiled wheels for: Linux x86_64, macOS x86_64, macOS ARM64, Windows x86_64
- maturin build configuration for multiple platforms
- CI/CD pipeline for building wheels on all platforms
- PyPI distribution with platform-specific wheels
- Docker image with Rust extensions included
- Fallback to pure Python on unsupported platforms

</details>

<details>
<summary>US-5: Contributor - Maintainable Rust Codebase</summary>

**As a** Contributor
**I want** the Rust PII filter code to be well-documented and maintainable
**So that** I can understand and extend it easily

**Acceptance Criteria:**

```gherkin
Given I am reviewing the Rust PII filter codebase
When I read the code structure
Then I should find:
 - Clear module organization (detector.rs, masking.rs, patterns.rs, lib.rs)
 - Comprehensive rustdoc comments on all public functions
 - Examples in doc comments showing usage
 - Type annotations for all function signatures
 - Error handling with Result<T, E> types

When I run: cargo doc --open
Then I should see generated documentation with:
 - Module-level overview of PII detection architecture
 - Function-level docs with parameter descriptions
 - Examples showing how to use each function
 - Links to relevant Python code for comparison

When I run: cargo clippy
Then there should be zero warnings (all clippy suggestions addressed)

When I run: cargo fmt --check
Then code should be formatted according to Rust style guide

When I add a new PII pattern
Then I should be able to:
 - Add the pattern to patterns.rs with clear comments
 - Write unit tests in the same file using #[cfg(test)]
 - Run cargo test to verify the pattern works
 - See code coverage report showing new pattern is tested
```

**Technical Requirements:**
- Rustdoc comments on all public APIs
- Examples in doc comments (tested with `cargo test --doc`)
- cargo clippy passing with zero warnings
- cargo fmt for consistent formatting
- Unit tests with >90% code coverage
- Integration tests for Python ↔ Rust boundary
- README.md in plugins_rust/ explaining architecture

</details>

<details>
<summary>US-6: Performance Engineer - Benchmarking and Profiling</summary>

**As a** Performance Engineer
**I want** to benchmark and profile the Rust PII filter
**So that** I can identify bottlenecks and optimize further

**Acceptance Criteria:**

```gherkin
Given I want to benchmark the PII filter
When I run: cargo bench
Then I should see criterion benchmark results for:
 - Pattern compilation (one-time cost)
 - Regex matching on 1KB, 10KB, 100KB payloads
 - Masking operations (redact, partial, hash, tokenize)
 - Nested JSON traversal (1 level, 3 levels, 5 levels)
 - Full pipeline: detect + mask

When I compare benchmark results to Python baseline
Then I should see:
 - 1KB payload: Rust 0.5ms vs Python 5ms (10x speedup)
 - 10KB payload: Rust 2ms vs Python 50ms (25x speedup)
 - 100KB payload: Rust 15ms vs Python 500ms (33x speedup)

When I run: cargo flamegraph --bench pii_filter
Then I should see a flamegraph showing:
 - Hot paths (where time is spent)
 - Regex matching dominating CPU time
 - Minimal overhead from PyO3 bindings (<5%)

When I profile memory usage with valgrind/heaptrack
Then I should see:
 - Memory usage comparable to Python (within 20%)
 - No memory leaks over 10,000 iterations
 - Efficient string handling with Cow<str> (copy-on-write)

When I optimize based on profiling data
Then I should document:
 - What was optimized (e.g., used RegexSet for parallel matching)
 - Measured improvement (e.g., 2x faster regex matching)
 - Benchmarks before and after
```

**Technical Requirements:**
- Criterion benchmark suite in benches/pii_filter.rs
- Flamegraph integration for CPU profiling
- Memory profiling with valgrind or heaptrack
- Baseline benchmarks for Python implementation
- Documentation of optimization decisions
- Performance regression tests in CI

</details>

<details>
<summary>US-7: DevOps Engineer - CI/CD Integration</summary>

**As a** DevOps Engineer
**I want** automated CI/CD for building and testing Rust plugins
**So that** releases are reliable and reproducible

**Acceptance Criteria:**

```gherkin
Given a pull request modifying Rust PII filter code
When CI runs on GitHub Actions
Then it should:
 - Build Rust extensions for Linux, macOS, Windows
 - Run cargo test on all platforms
 - Run cargo clippy and enforce zero warnings
 - Run cargo fmt --check
 - Run Python integration tests with Rust extensions
 - Run differential tests: Rust vs Python outputs
 - Generate code coverage report (codecov.io)
 - Build wheels with maturin for all platforms

When all tests pass
Then the PR should be mergeable
And wheels should be uploaded as GitHub artifacts

When a release is tagged (e.g., v0.9.0)
Then CI should:
 - Build release wheels for all platforms
 - Run full test suite including benchmarks
 - Publish wheels to PyPI (if tests pass)
 - Update Docker images with new Rust binaries
 - Generate release notes with performance improvements

When tests fail
Then CI should:
 - Fail the build with clear error messages
 - Report which platform failed (Linux/macOS/Windows)
 - Show detailed test output for debugging
 - Not publish wheels or Docker images
```

**Technical Requirements:**
- GitHub Actions workflow: .github/workflows/rust-plugins.yml
- Matrix build for: Linux x86_64, macOS x86_64/ARM64, Windows x86_64
- Python versions: 3.11, 3.12, 3.13
- Rust toolchain: stable channel
- maturin for building wheels
- Automated PyPI publishing on release tags
- Docker image builds with Rust extensions
- Code coverage reporting to codecov.io

</details>

<details>
<summary>US-8: Security Auditor - Secure Rust Implementation</summary>

**As a** Security Auditor
**I want** the Rust PII filter to be free of security vulnerabilities
**So that** it doesn't introduce new attack vectors

**Acceptance Criteria:**

```gherkin
Given the Rust PII filter codebase
When I run: cargo audit
Then there should be zero known vulnerabilities in dependencies

When I review the code for unsafe blocks
Then I should find:
 - Zero unsafe blocks in core detection/masking logic
 - Any unavoidable unsafe blocks must be documented with safety proofs
 - PyO3 FFI is safe by design (no manual unsafe needed)

When I test with malicious inputs designed to trigger panics
Then the Rust code should:
 - Never panic in production (use Result<T, E> for error handling)
 - Catch panics at PyO3 boundary and return Python exceptions
 - Log security events for suspicious patterns

When I test for ReDoS (Regular Expression Denial of Service)
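Then the Rust regex engine should:
 - Have bounded execution time (the regex crate guarantees linear-time matching but has no built-in timeout, so any 1-second limit must be enforced by the caller)
 - Not allow catastrophic backtracking
 - Reject overly complex patterns that could DoS (e.g., via the compiled-size limit)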

When I test for memory safety issues
Then the code should:
 - Have zero use-after-free bugs
 - Have zero buffer overflows
 - Have zero data races (pass cargo test with ThreadSanitizer)

When I review dependency supply chain
Then I should verify:
 - All dependencies from crates.io have >1000 downloads/month
 - Key dependencies (regex, serde, pyo3) have security audit history
 - Dependency versions are pinned in Cargo.lock
 - cargo deny is configured to block risky licenses and vulnerabilities
```

**Technical Requirements:**
- cargo audit in CI to check for vulnerabilities
- cargo deny configuration for license/vulnerability checks
- ThreadSanitizer and AddressSanitizer testing in CI
- Panic handling at PyO3 boundary
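- Bounded regex execution time (linear-time `regex` crate; a 1 second default limit, if required, enforced by the caller because the crate has no timeout API)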
- Zero unsafe blocks (or heavily documented and audited)
- Security review before first release

</details>

---

## 🏗 Architecture

### High-Level Design

```mermaid
graph TB
 subgraph "Python Layer (mcpgateway/plugins/pii_filter/)"
 PY[PIIFilterPlugin pii_filter.py]
 PY_DETECT[Python PIIDetector Fallback Implementation]
 PY_CONFIG[PIIFilterConfig Pydantic Model]
 end

 subgraph "Rust Core (plugins_rust/src/pii_filter/)"
 RUST_LIB[lib.rs PyO3 Bindings]
 RUST_DETECT[detector.rs PIIDetector Struct]
 RUST_PATTERNS[patterns.rs Regex Compilation]
 RUST_MASK[masking.rs Masking Strategies]
 RUST_TRAVERSE[traverse.rs JSON/Dict Recursion]
 end

 PY -->|1. Try import| RUST_LIB
 RUST_LIB -->|Success| RUST_DETECT
 RUST_LIB -->|ImportError| PY_DETECT
 
 RUST_DETECT --> RUST_PATTERNS
 RUST_DETECT --> RUST_MASK
 RUST_DETECT --> RUST_TRAVERSE
 
 PY_CONFIG -->|Config| RUST_DETECT
 PY_CONFIG -->|Config| PY_DETECT
```

### Rust Module Structure

```
plugins_rust/
├── Cargo.toml              # Dependencies and build config
├── pyproject.toml          # maturin build config
├── README.md               # Architecture documentation
├── src/
│   ├── lib.rs              # PyO3 module definition
│   └── pii_filter/
│       ├── mod.rs          # Module exports
│       ├── detector.rs     # Core PIIDetector struct
│       ├── patterns.rs     # Regex pattern compilation
│       ├── masking.rs      # Masking strategies
│       ├── traverse.rs     # Recursive JSON/dict traversal
│       └── config.rs       # Configuration types
├── benches/
│   └── pii_filter.rs       # Criterion benchmarks
└── tests/
    └── integration.rs      # Integration tests
```

### Data Flow: Python → Rust → Python

```rust
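use std::collections::HashMap;

use pyo3::prelude::*;

// 1. Python calls Rust via PyO3
#[pyfunction]
fn detect_pii(text: &str, config: &PyAny) -> PyResult<HashMap<String, Vec<Detection>>> {
    // 2. Convert Python config to Rust struct (PIIConfig must implement FromPyObject)
    let rust_config: PIIConfig = config.extract()?;

    // 3. Create detector. Patterns are compiled here, on every call; hot paths
    //    should reuse a PIIDetector instance so compilation happens only once.
    let detector = PIIDetector::new(rust_config);

    // 4. Detect PII (fast Rust code)
    let detections = detector.detect(text);

    // 5. PyO3 converts the returned HashMap into a Python dict
    //    (Detection must be convertible to a Python object, e.g. a #[pyclass])
    Ok(detections)
}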
```

### Performance Optimizations

**1. RegexSet for Parallel Matching**
```rust
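use regex::RegexSet;

// Instead of testing each pattern sequentially (pseudocode for the Python loop):
// Python: O(N patterns × M text length)
//
//   for pattern in patterns:
//       if pattern.search(text): ...

// Use RegexSet to test all patterns in a single scan of the text.
// RegexSet reports only WHICH patterns matched, not where, so the matching
// individual regexes are run afterwards to recover positions and captures.
let set = RegexSet::new(patterns)?;
let matches = set.matches(text); // All patterns tested in one pass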
```

**2. Copy-on-Write Strings**
```rust
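use std::borrow::Cow;

// The explicit lifetime ties the returned Cow to `text`; with two reference
// parameters the compiler cannot infer the output lifetime on its own.
fn mask<'a>(text: &'a str, detections: &[Detection]) -> Cow<'a, str> {
    if detections.is_empty() {
        // No PII detected - return reference (zero copy)
        Cow::Borrowed(text)
    } else {
        // PII detected - allocate new string with masking
        Cow::Owned(apply_masking(text, detections))
    }
}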
```

**3. Zero-Copy JSON Traversal**
```rust
use serde_json::Value;

fn traverse(value: &Value) -> Vec<Detection> {
    match value {
        Value::String(s) => detect_in_string(s),
        Value::Object(map) => {
            // Traverse without cloning
            map.values().flat_map(traverse).collect()
        }
        Value::Array(arr) => {
            arr.iter().flat_map(traverse).collect()
        }
        _ => vec![],
    }
}
```

---

## 📋 Implementation Tasks

### Phase 1: Project Setup (Week 1)

- [ ] **Create Rust Project Structure**
 - [ ] `mkdir plugins_rust && cd plugins_rust`
 - [ ] `cargo init --lib`
 - [ ] Configure Cargo.toml with PyO3 dependencies
 - [ ] Configure pyproject.toml for maturin
 - [ ] Create module structure: src/pii_filter/{mod.rs, detector.rs, patterns.rs, masking.rs, traverse.rs}

- [ ] **Dependencies Configuration**
 ```toml
 [dependencies]
 pyo3 = { version = "0.20", features = ["extension-module"] }
 regex = "1.10"
 serde = { version = "1.0", features = ["derive"] }
 serde_json = "1.0"
 
 [dev-dependencies]
 criterion = "0.5"
 proptest = "1.4"
 
 [lib]
 name = "plugins_rust"
 crate-type = ["cdylib"]
 ```

- [ ] **CI/CD Pipeline Setup**
 - [ ] Create .github/workflows/rust-plugins.yml
 - [ ] Matrix build: Linux x86_64, macOS x86_64/ARM64, Windows x86_64
 - [ ] Python versions: 3.11, 3.12, 3.13
 - [ ] Run cargo test, cargo clippy, cargo fmt --check
 - [ ] Build wheels with maturin
 - [ ] Upload artifacts to GitHub

- [ ] **Local Development Environment**
 - [ ] Install Rust: `curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh`
 - [ ] Install maturin: `pip install maturin`
 - [ ] Test build: `maturin develop --release`
 - [ ] Verify Python can import: `python -c "import plugins_rust"`

---

### Phase 2: Core Rust Implementation (Week 2-3)

- [ ] **Pattern Compilation (patterns.rs)**
 - [ ] Define PIIType enum (SSN, CreditCard, Email, Phone, etc.)
 - [ ] Define MaskingStrategy enum (Redact, Partial, Hash, Tokenize, Remove)
 - [ ] Implement compile_patterns() -> RegexSet + Vec<Regex>
 - [ ] 12 patterns: SSN, credit card, email, phone, IP, DOB, passport, driver's license, bank account, medical record, AWS key, API key
 - [ ] Unit tests for each pattern with positive and negative cases
 - [ ] Benchmark pattern compilation time (<1ms)

- [ ] **Detection Logic (detector.rs)**
 - [ ] PIIDetector struct with compiled patterns
 - [ ] impl PIIDetector::new(config: PIIConfig) -> Self
 - [ ] impl PIIDetector::detect(&self, text: &str) -> HashMap<PIIType, Vec<Detection>>
 - [ ] Use RegexSet for parallel matching (single pass)
 - [ ] Then use individual regexes for capture groups
 - [ ] Whitelist pattern support (exclude certain matches)
 - [ ] Overlap detection (don't count same span twice)
 - [ ] Unit tests with comprehensive test corpus
 - [ ] Benchmark detection on 1KB, 10KB, 100KB payloads
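
The whitelist and overlap rules above can be prototyped in Python before they are ported to Rust. The following is a minimal sketch only: the three patterns, the `Detection` tuple, and the tie-breaking rule (earliest start wins, then longest match, then pattern order) are assumptions for illustration, not the plugin's actual behavior.

```python
import re
from typing import NamedTuple


class Detection(NamedTuple):
    pii_type: str
    start: int
    end: int
    value: str


# Illustrative patterns only; the plugin's real patterns may differ.
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "aws_key": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "api_key": re.compile(r"\b[A-Za-z0-9]{20,}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
}


def detect(text: str, whitelist: tuple[str, ...] = ()) -> list[Detection]:
    """Return non-overlapping detections, skipping whitelisted matches."""
    allow = [re.compile(p) for p in whitelist]
    found = []
    for pii_type, pattern in PATTERNS.items():
        for m in pattern.finditer(text):
            # Whitelist: drop any match that a whitelist pattern also matches.
            if any(w.search(m.group()) for w in allow):
                continue
            found.append(Detection(pii_type, m.start(), m.end(), m.group()))

    # Overlap resolution: earliest start first, longest match first on ties.
    # The sort is stable, so equal spans keep PATTERNS insertion order.
    found.sort(key=lambda d: (d.start, -(d.end - d.start)))
    kept, last_end = [], 0
    for d in found:
        if d.start >= last_end:  # span does not overlap anything already kept
            kept.append(d)
            last_end = d.end
    return kept


text = "id 123-45-6789, key AKIAIOSFODNN7EXAMPLE, mail admin@example.com"
for d in detect(text, whitelist=(r"@example\.com$",)):
    print(d.pii_type, d.start, d.end)
```

In this example the AWS key also matches the broader `api_key` pattern over the same span, and only the first is kept. Whatever tie-breaking rule the Python implementation really uses has to be reproduced exactly in Rust for the differential tests to pass.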

- [ ] **Masking Logic (masking.rs)**
 - [ ] impl mask(text: &str, detections: &HashMap, strategy: MaskingStrategy) -> String
 - [ ] Redact strategy: Replace with [REDACTED]
 - [ ] Partial strategy: Show first/last chars (e.g., ***-**-1234 for SSN)
 - [ ] Hash strategy: Replace with SHA256 hash (first 8 chars)
 - [ ] Tokenize strategy: Replace with UUID token
 - [ ] Remove strategy: Delete entirely
 - [ ] Efficient multi-replacement using reverse iteration
 - [ ] Use Cow<str> for zero-copy when no masking needed
 - [ ] Unit tests for each masking strategy
 - [ ] Benchmark masking operations (<0.3ms per 1KB)
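
As a reference for the masking strategies and the reverse-iteration trick, here is a small Python sketch. It assumes detections arrive as non-overlapping `(start, end)` character spans, and the exact output formats (for example, keeping the last four characters for partial masking) are illustrative guesses rather than the plugin's confirmed formats.

```python
import hashlib
import uuid


def mask_value(value: str, strategy: str) -> str:
    """Return the replacement for one detected value."""
    if strategy == "redact":
        return "[REDACTED]"
    if strategy == "partial":
        if len(value) <= 4:
            return "*" * len(value)
        # Keep the last four characters; star out letters and digits before
        # them while preserving separators such as '-'.
        head, tail = value[:-4], value[-4:]
        return "".join("*" if c.isalnum() else c for c in head) + tail
    if strategy == "hash":
        return hashlib.sha256(value.encode("utf-8")).hexdigest()[:8]
    if strategy == "tokenize":
        return str(uuid.uuid4())
    if strategy == "remove":
        return ""
    raise ValueError(f"unknown masking strategy: {strategy}")


def mask(text: str, spans: list[tuple[int, int]], strategy: str = "redact") -> str:
    """Mask every (start, end) span in `text`."""
    if not spans:
        return text  # nothing to mask: return the input unchanged
    # Work from the end of the string backwards so that replacing one span
    # never shifts the offsets of the spans still to be processed.
    for start, end in sorted(spans, reverse=True):
        text = text[:start] + mask_value(text[start:end], strategy) + text[end:]
    return text


sample = "SSN 123-45-6789 and 987-65-4321."
print(mask(sample, [(4, 15), (20, 31)], "partial"))
```

Processing spans front to back would require tracking a running offset after each replacement. Reverse order avoids that bookkeeping, at the cost of repeated slicing, which a production version would likely replace with a single output buffer.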

- [ ] **JSON Traversal (traverse.rs)**
 - [ ] impl process_nested(data: &Value, detector: &PIIDetector) -> (bool, Value, Vec<Detection>)
 - [ ] Recursive traversal of JSON structures (Object, Array, String)
 - [ ] Detect JSON strings within strings (parse and traverse)
 - [ ] Apply masking to detected PII
 - [ ] Return (modified: bool, new_value: Value, all_detections: Vec)
 - [ ] Handle nested structures up to 10 levels deep
 - [ ] Unit tests with complex nested JSON
 - [ ] Benchmark traversal on realistic payloads
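
The traversal contract (a `(modified, new_value, all_detections)` triple, JSON embedded in strings, a depth limit) can be pinned down with a Python sketch like the one below. It uses a single illustrative SSN pattern and redaction only; leaving data beyond `max_depth` untouched is an assumption made here, and the real policy for over-deep payloads should be decided explicitly.

```python
import json
import re

SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")  # single illustrative pattern


def process_nested(data, depth=0, max_depth=10):
    """Return (modified, new_value, detections) for arbitrarily nested data."""
    if depth > max_depth:
        return False, data, []  # assumption: over-deep data is left as-is

    if isinstance(data, str):
        stripped = data.strip()
        # A string that itself holds JSON is parsed and traversed recursively.
        if stripped[:1] in ("{", "["):
            try:
                inner = json.loads(stripped)
            except ValueError:
                inner = None
            if isinstance(inner, (dict, list)):
                modified, new_inner, found = process_nested(inner, depth + 1, max_depth)
                return modified, (json.dumps(new_inner) if modified else data), found
        found = SSN.findall(data)
        if not found:
            return False, data, []
        return True, SSN.sub("[REDACTED]", data), found

    if isinstance(data, dict):
        out, modified, found = {}, False, []
        for key, value in data.items():
            changed, out[key], hits = process_nested(value, depth + 1, max_depth)
            modified = modified or changed
            found.extend(hits)
        return modified, (out if modified else data), found

    if isinstance(data, list):
        out, modified, found = [], False, []
        for value in data:
            changed, new_value, hits = process_nested(value, depth + 1, max_depth)
            out.append(new_value)
            modified = modified or changed
            found.extend(hits)
        return modified, (out if modified else data), found

    return False, data, []  # numbers, booleans, None


payload = {"user": {"ssn": "123-45-6789"}, "raw": '{"note": "ssn 111-22-3333"}'}
print(process_nested(payload))
```

Note that re-serializing an embedded JSON string with `json.dumps` may change its whitespace or key formatting. If byte-for-byte identical output is required, the Rust version has to re-serialize in exactly the same way as the Python one.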

- [ ] **Configuration (config.rs)**
 - [ ] PIIConfig struct mirroring Python PIIFilterConfig
 - [ ] Serde derive for deserialization from Python
 - [ ] Default values matching Python defaults
 - [ ] Validation logic for config fields

---

### Phase 3: PyO3 Bindings (Week 3)

- [ ] **Python Module Definition (lib.rs)**
 - [ ] `#[pymodule]` declaration for plugins_rust
 - [ ] Export PIIDetector as Python class
 - [ ] Export detect_pii, mask_pii, process_nested as functions

- [ ] **PIIDetector Python Class**
 ```rust
 #[pyclass]
 struct PIIDetector {
     inner: detector::PIIDetector,
 }

 #[pymethods]
 impl PIIDetector {
     #[new]
     fn new(config: &PyAny) -> PyResult<Self> {
         // Extract Python config to Rust struct
         let rust_config: PIIConfig = config.extract()?;
         Ok(Self {
             inner: detector::PIIDetector::new(rust_config),
         })
     }

     fn detect(&self, text: &str) -> PyResult<HashMap<String, Vec<PyDetection>>> {
         // Call Rust detector
         let detections = self.inner.detect(text);

         // Convert to Python types
         Ok(convert_to_python(detections))
     }

     fn mask(&self, text: &str, detections: &PyAny) -> PyResult<String> {
         // Convert Python detections to Rust
         let rust_detections = convert_from_python(detections)?;

         // Call Rust masking
         Ok(self.inner.mask(text, &rust_detections))
     }
 }
 ```

- [ ] **Type Conversions**
 - [ ] Python dict → Rust HashMap
 - [ ] Python list → Rust Vec
 - [ ] Python str → Rust &str
 - [ ] Rust Result → Python exceptions
 - [ ] Handle None/null values

- [ ] **Error Handling**
 - [ ] Catch Rust panics at boundary
 - [ ] Convert to Python exceptions
 - [ ] Preserve error messages and context

---

### Phase 4: Python Integration (Week 4)

- [ ] **Modify Python Plugin (plugins/pii_filter/pii_filter.py)**
 - [ ] Try importing Rust module at top of file
 ```python
 try:
     from plugins_rust import PIIDetector as RustPIIDetector
     USE_RUST = True
     logger.info("PII Filter using Rust implementation (5-10x faster)")
 except ImportError as e:
     USE_RUST = False
     logger.warning(f"PII Filter using Python implementation: {e}")
     logger.warning("Install mcpgateway[rust] for 5-10x better performance")
 ```

- [ ] **Detector Selection Logic**
 ```python
 def __init__(self, config: PluginConfig):
     super().__init__(config)
     self.pii_config = PIIFilterConfig.model_validate(self._config.config)

     # Check environment variable override
     force_python = os.getenv("MCPGATEWAY_FORCE_PYTHON_PLUGINS", "false").lower() == "true"

     if USE_RUST and not force_python:
         self.detector = RustPIIDetector(self.pii_config.model_dump())
         self.implementation = "rust"
     else:
         self.detector = PIIDetector(self.pii_config)  # Python implementation
         self.implementation = "python"

     logger.info(f"PII Filter initialized with {self.implementation} implementation")
 ```

- [ ] **Metrics Integration**
 - [ ] Add `implementation` label to Prometheus metrics
 - [ ] Track detection duration by implementation
 - [ ] Track masking duration by implementation
 - [ ] Track detection counts by implementation

- [ ] **Backward Compatibility**
 - [ ] Ensure same API for both Rust and Python detectors
 - [ ] Same return types (dicts, lists)
 - [ ] Same exception types
 - [ ] Same configuration format

---

### Phase 5: Testing (Week 4-5)

- [ ] **Unit Tests (Rust)**
 - [ ] tests/ directory with integration tests
 - [ ] Test each pattern individually (50+ test cases)
 - [ ] Test masking strategies (20+ test cases)
 - [ ] Test JSON traversal (30+ test cases)
 - [ ] Test edge cases: empty strings, null bytes, Unicode
 - [ ] Run with: `cargo test`

- [ ] **Integration Tests (Python ↔ Rust)**
 - [ ] tests/unit/mcpgateway/plugins/test_pii_filter_rust.py
 - [ ] Test Python can import Rust module
 - [ ] Test config conversion (Python → Rust)
 - [ ] Test detection results (Rust → Python)
 - [ ] Test error handling (Rust exceptions → Python)

- [ ] **Differential Testing**
 - [ ] tests/differential/test_pii_filter_differential.py
 - [ ] 1000+ test cases covering all pattern types
 - [ ] Run same input through both Python and Rust
 - [ ] Assert outputs are identical (byte-for-byte)
 - [ ] Test corpus: SSNs, credit cards, emails, phones, etc.
 - [ ] Test nested JSON structures
 - [ ] Test real-world payloads from production (sanitized)
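
A differential harness can be very small. The sketch below uses two stand-in detectors, because the real ones are not shown here: `reference_detect` plays the existing Python detector and `candidate_detect` plays the Rust one. The stand-in deliberately reports UTF-8 byte offsets, since the Rust `regex` crate returns byte offsets while Python's `re` returns character offsets; this is one mismatch that a differential test with non-ASCII input would surface. All function names are hypothetical.

```python
import re
from typing import Callable

Spans = list[tuple[str, int, int]]  # (pii_type, start, end)

_SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def reference_detect(text: str) -> Spans:
    """Stand-in for the Python detector: character offsets."""
    return [("ssn", m.start(), m.end()) for m in _SSN.finditer(text)]


def candidate_detect(text: str) -> Spans:
    """Stand-in for the Rust detector: UTF-8 byte offsets, not converted."""
    return [
        ("ssn",
         len(text[:m.start()].encode("utf-8")),
         len(text[:m.end()].encode("utf-8")))
        for m in _SSN.finditer(text)
    ]


def run_differential(corpus: list[str],
                     reference: Callable[[str], Spans],
                     candidate: Callable[[str], Spans]) -> list[tuple[str, Spans, Spans]]:
    """Return (input, expected, actual) for every case where outputs differ."""
    mismatches = []
    for text in corpus:
        expected, actual = reference(text), candidate(text)
        if expected != actual:
            mismatches.append((text, expected, actual))
    return mismatches


corpus = ["ssn 123-45-6789", "caf\u00e9 123-45-6789", ""]
for text, expected, actual in run_differential(corpus, reference_detect, candidate_detect):
    print(f"MISMATCH {text!r}: python={expected} rust={actual}")
```

Only the input containing a non-ASCII character is reported, which is why the test corpus should include Unicode cases and why the Rust bindings need to convert byte offsets to character offsets before returning positions to Python.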

- [ ] **Property-Based Testing**
 - [ ] Use proptest (Rust) to generate random inputs
 - [ ] Test invariants: detection count ≥ 0, no crashes, etc.
 - [ ] Use hypothesis (Python) to generate test cases
 - [ ] Run 10,000+ iterations with random data

- [ ] **Fuzzing**
 - [ ] Use cargo-fuzz or AFL for Rust fuzzing
 - [ ] Fuzz detect() function with random strings
 - [ ] Fuzz mask() function with random detections
 - [ ] Fuzz JSON traversal with malformed JSON
 - [ ] Ensure no panics or crashes after 1M+ iterations

- [ ] **Performance Testing**
 - [ ] tests/performance/test_pii_filter_benchmark.py
 - [ ] Benchmark Python vs Rust with 1KB, 10KB, 100KB payloads
 - [ ] Measure P50, P95, P99 latencies
 - [ ] Assert Rust is 5-10x faster than Python
 - [ ] Generate performance comparison report
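
The latency percentiles can be collected with the standard library alone. This is a minimal sketch: `latency_percentiles` is a hypothetical helper, the regex workload is a placeholder for the real detector call, and a real benchmark would also add warm-up runs and pin the process to reduce noise.

```python
import re
import statistics
import time


def latency_percentiles(func, payload, iterations=200):
    """Call func(payload) repeatedly and return P50/P95/P99 latency in ms."""
    samples = []
    for _ in range(iterations):
        t0 = time.perf_counter()
        func(payload)
        samples.append((time.perf_counter() - t0) * 1000.0)
    # n=100 yields 99 cut points; index i holds the (i + 1)th percentile.
    cuts = statistics.quantiles(samples, n=100, method="inclusive")
    return {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]}


# Placeholder workload: one illustrative pattern over a roughly 1KB payload.
ssn = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
payload = ("lorem ipsum 123-45-6789 " * 43)[:1024]

stats = latency_percentiles(ssn.findall, payload)
print({name: round(ms, 4) for name, ms in stats.items()})
```

Running the same helper against both detectors and dividing the Python P50 by the Rust P50 gives the speedup figure that the 5-10x assertion would check.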

---

### Phase 6: Benchmarking (Week 5)

- [ ] **Criterion Benchmarks (Rust)**
 - [ ] benches/pii_filter.rs
 - [ ] Benchmark pattern compilation (one-time cost)
 - [ ] Benchmark detection on various payload sizes
 - [ ] Benchmark masking operations
 - [ ] Benchmark JSON traversal
 - [ ] Run with: `cargo bench`
 - [ ] Generate HTML reports in target/criterion/

- [ ] **Python Baseline Benchmarks**
 - [ ] benchmarks/baseline_python.py
 - [ ] Measure Python implementation performance
 - [ ] Same test cases as Rust benchmarks
 - [ ] Generate comparison charts

- [ ] **Flamegraph Profiling**
 - [ ] Install cargo-flamegraph: `cargo install flamegraph`
 - [ ] Generate flamegraph: `cargo flamegraph --bench pii_filter`
 - [ ] Identify hot paths (should be regex matching)
 - [ ] Verify PyO3 overhead is <5%

- [ ] **Memory Profiling**
 - [ ] Use valgrind or heaptrack
 - [ ] Measure memory usage: Rust vs Python
 - [ ] Check for memory leaks over 10,000 iterations
 - [ ] Verify memory usage is comparable (within 20%)

---

### Phase 7: Documentation (Week 6)

- [ ] **Rust API Documentation**
 - [ ] Rustdoc comments on all public functions
 - [ ] Module-level overview in src/pii_filter/mod.rs
 - [ ] Examples in doc comments
 - [ ] Generate docs: `cargo doc --open`
 - [ ] Ensure 100% documentation coverage

- [ ] **Python Integration Guide**
 - [ ] Update plugins/pii_filter/README.md
 - [ ] Document Rust implementation benefits
 - [ ] Installation instructions: `pip install mcpgateway[rust]`
 - [ ] Configuration examples (same format for both)
 - [ ] Troubleshooting: fallback to Python, forced Python mode

- [ ] **Performance Comparison**
 - [ ] Create docs/rust-plugins-performance.md
 - [ ] Benchmark results: Rust vs Python
 - [ ] Charts showing speedup across payload sizes
 - [ ] Memory usage comparison
 - [ ] When to use Rust vs Python

- [ ] **Contributing Guide**
 - [ ] Update CONTRIBUTING.md for Rust plugins
 - [ ] How to set up Rust development environment
 - [ ] How to build and test locally
 - [ ] How to run benchmarks
 - [ ] Code style: cargo fmt, cargo clippy

---

### Phase 8: Distribution (Week 6-7)

- [ ] **Build Wheels for All Platforms**
 - [ ] Linux x86_64: maturin build --release --target x86_64-unknown-linux-gnu
 - [ ] macOS x86_64: maturin build --release --target x86_64-apple-darwin
 - [ ] macOS ARM64: maturin build --release --target aarch64-apple-darwin
 - [ ] Windows x86_64: maturin build --release --target x86_64-pc-windows-msvc
 - [ ] Test wheels on each platform

- [ ] **PyPI Publishing**
 - [ ] Configure maturin for PyPI
 - [ ] Test upload to TestPyPI first
 - [ ] Publish to PyPI: `maturin publish`
 - [ ] Verify installation: `pip install mcpgateway[rust]`

- [ ] **Docker Images**
 - [ ] Update Dockerfile to include Rust toolchain
 - [ ] Build and include Rust extensions in image
 - [ ] Test that Rust implementation is used by default
 - [ ] Publish to Docker Hub / GitHub Container Registry

- [ ] **Fallback Handling**
 - [ ] Test on unsupported platform (e.g., ARM32)
 - [ ] Verify graceful fallback to Python
 - [ ] Verify informative warning message
 - [ ] Document supported platforms in README

---

### Phase 9: Monitoring & Observability (Week 7)

- [ ] **Prometheus Metrics**
 - [ ] pii_filter_detections_duration_seconds{implementation="rust|python"}
 - [ ] pii_filter_masking_duration_seconds{implementation="rust|python"}
 - [ ] pii_filter_detections_total{implementation="rust|python"}
 - [ ] pii_filter_implementation{version="rust|python"} (gauge)
 - [ ] pii_filter_pattern_matches_total{pattern="ssn|email|..."} (counter)

- [ ] **Structured Logging**
 - [ ] Log implementation choice at startup
 - [ ] Log performance metrics (P50, P95, P99)
 - [ ] Log detection counts by type
 - [ ] Log errors and warnings with context

- [ ] **Tracing Integration**
 - [ ] Add OpenTelemetry spans for Rust operations
 - [ ] Trace detect(), mask(), process_nested()
 - [ ] Include payload size and detection count in spans

- [ ] **Grafana Dashboard**
 - [ ] Create dashboard for PII filter metrics
 - [ ] Compare Rust vs Python performance
 - [ ] Show detection rates by pattern type
 - [ ] Alert on performance degradation

---

### Phase 10: Security & Audit (Week 7)

- [ ] **Dependency Audit**
 - [ ] Run cargo audit to check for vulnerabilities
 - [ ] Configure cargo-deny to block risky dependencies
 - [ ] Review dependency supply chain (downloads, maintainers)
 - [ ] Pin dependency versions in Cargo.lock

- [ ] **Code Review**
 - [ ] Review for unsafe blocks (should be zero)
 - [ ] Review for panics (should use Result<T, E>)
 - [ ] Review error handling at PyO3 boundary
 - [ ] Review regex patterns for ReDoS vulnerabilities

- [ ] **Sanitizer Testing**
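 - [ ] Run with AddressSanitizer (nightly toolchain): `RUSTFLAGS="-Zsanitizer=address" cargo +nightly test --target x86_64-unknown-linux-gnu`
 - [ ] Run with ThreadSanitizer (nightly toolchain): `RUSTFLAGS="-Zsanitizer=thread" cargo +nightly test -Zbuild-std --target x86_64-unknown-linux-gnu`
 - [ ] Run with MemorySanitizer (nightly toolchain): `RUSTFLAGS="-Zsanitizer=memory" cargo +nightly test -Zbuild-std --target x86_64-unknown-linux-gnu`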
 - [ ] Fix any reported issues

- [ ] **Security Documentation**
 - [ ] Document security considerations
 - [ ] Threat model for PII detection
 - [ ] ReDoS mitigation (regex timeout enforcement; note that the Rust `regex` crate has no timeout and instead guarantees linear-time matching)
 - [ ] Memory safety guarantees
 - [ ] Panic handling strategy

---

## ✅ Success Criteria

### Performance Targets
- [ ] **5-10x speedup** over Python implementation as a minimum, with larger gains expected on larger payloads
 - 1KB payload: Rust <1ms vs Python ~5ms (>5x)
 - 10KB payload: Rust <2ms vs Python ~50ms (>25x)
 - 100KB payload: Rust <15ms vs Python ~500ms (>33x)
- [ ] **P95 latency**: <2ms for 1KB payloads
- [ ] **P99 latency**: <3ms for 1KB payloads
- [ ] **Throughput**: 500+ requests/sec per core (vs 50-100 for Python)
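
The per-payload targets can be checked mechanically once benchmark numbers exist. The sketch below uses the target figures from this list as placeholder inputs; the names `TARGETS`, `speedup`, and `check_targets` are illustrative, and real runs would substitute measured medians.

```python
# (payload, Rust upper bound in ms, approximate Python baseline in ms),
# taken from the performance targets in this issue.
TARGETS = [("1KB", 1.0, 5.0), ("10KB", 2.0, 50.0), ("100KB", 15.0, 500.0)]


def speedup(python_ms, rust_ms):
    """Ratio of Python time to Rust time for the same payload."""
    return python_ms / rust_ms


def check_targets(targets, minimum=5.0):
    """Map each payload size to whether it meets the minimum speedup."""
    return {name: speedup(py, rs) >= minimum for name, rs, py in targets}


for name, rust_ms, python_ms in TARGETS:
    print(f"{name}: {speedup(python_ms, rust_ms):.1f}x")
```

Because the Rust figures are upper bounds, the ratios computed from them are lower bounds on the speedup.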

### Compatibility Targets
- [ ] **100% behavioral compatibility** with Python implementation
- [ ] **Zero regressions** in detection accuracy
- [ ] **Identical outputs** in differential testing (1000+ test cases)
- [ ] **Graceful fallback** to Python when Rust unavailable
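
Differential testing here means feeding the same inputs to both implementations and requiring identical outputs. The following is a minimal sketch under stated assumptions: `detect_reference` and `detect_candidate` are stand-ins (both use Python's `re` with a simplified email pattern), where a real harness would call the Python plugin and the Rust extension.

```python
import random
import re

# Simplified email pattern, for illustration only.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def detect_reference(text):
    """Stand-in for the Python implementation: list of (start, end, match)."""
    return [(m.start(), m.end(), m.group()) for m in EMAIL_RE.finditer(text)]


def detect_candidate(text):
    """Stand-in for the Rust implementation; here it reuses the same regex."""
    return [(m.start(), m.end(), m.group()) for m in EMAIL_RE.finditer(text)]


def random_case(rng):
    """Build a random input from a small vocabulary of PII and non-PII tokens."""
    words = ["hello", "alice@example.com", "123-45-6789", "bob@test.org", "n/a"]
    return " ".join(rng.choice(words) for _ in range(rng.randint(0, 8)))


def run_differential(cases=1000, seed=0):
    """Return every generated input on which the two implementations disagree."""
    rng = random.Random(seed)
    mismatches = []
    for _ in range(cases):
        text = random_case(rng)
        if detect_reference(text) != detect_candidate(text):
            mismatches.append(text)
    return mismatches
```

A fixed seed keeps failures reproducible. One thing worth covering in the real corpus is non-ASCII text: the Rust `regex` crate reports byte offsets while Python's `re` on `str` reports code-point offsets, so positions need normalizing before comparison.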

### Quality Targets
- [ ] **Code coverage**: >90% for Rust code (measured with `cargo-tarpaulin`)
- [ ] **Zero warnings**: `cargo clippy -- -D warnings` passes
- [ ] **Zero crashes**: Fuzzing with 1M+ iterations without panics
- [ ] **Zero security issues**: `cargo audit` passes

### Distribution Targets
- [ ] **Pre-compiled wheels** for Linux x86_64, macOS x86_64/ARM64, Windows x86_64
- [ ] **PyPI published**: `pip install "mcpgateway[rust]"` works
- [ ] **Docker images** include Rust extensions
- [ ] **CI/CD passing** on all platforms

---

## 🏁 Definition of Done

- [ ] Rust PII filter implemented with all 12+ patterns
- [ ] PyO3 bindings expose PIIDetector to Python
- [ ] Python plugin auto-detects and uses Rust when available
- [ ] Graceful fallback to Python with informative logging
- [ ] Differential testing shows 100% output compatibility
- [ ] Benchmarks show 5-10x speedup over Python
- [ ] All unit tests passing (Rust and Python)
- [ ] All integration tests passing
- [ ] Property-based testing and fuzzing completed
- [ ] Documentation complete (Rustdoc, README, performance guide)
- [ ] Pre-compiled wheels built for all platforms
- [ ] Published to PyPI
- [ ] Docker images updated with Rust extensions
- [ ] CI/CD pipeline passing on all platforms
- [ ] Prometheus metrics integrated
- [ ] Security audit completed (cargo audit, sanitizers)
- [ ] Code review by 2+ maintainers
- [ ] Performance validated in staging environment
- [ ] Ready for production deployment

---

## 📈 Expected Impact

### Performance Improvements
- **Individual PII filter**: 5-10x faster
- **Overall gateway throughput**: +15-20% (since PII filter runs on 80% of requests)
- **P95 latency reduction**: 50ms → 38ms (saving 12ms)
- **P99 latency reduction**: 100ms → 76ms (saving 24ms)
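
One way to sanity-check how a component speedup maps to an overall gain is Amdahl's law. This is a model brought in for illustration, not the derivation behind the figures above; the function names are hypothetical.

```python
def overall_speedup(fraction, component_speedup):
    """Amdahl's law: whole-system speedup when one fraction of time gets faster."""
    return 1.0 / ((1.0 - fraction) + fraction / component_speedup)


def implied_fraction(overall, component_speedup):
    """Invert Amdahl's law: share of total time the component must occupy."""
    return (1.0 - 1.0 / overall) / (1.0 - 1.0 / component_speedup)


# Share of request time the PII filter would need to account for, under this
# model, to yield the stated throughput gains at each end of the speedup range.
for overall, component in [(1.15, 5.0), (1.20, 10.0)]:
    share = implied_fraction(overall, component)
    print(f"{overall:.2f}x overall at {component:.0f}x component -> {share:.1%} of time")
```

Under this model, a 15-20% overall gain from a 5-10x component speedup corresponds to the PII filter taking very roughly one sixth to one fifth of total request time, which is a useful figure to confirm with profiling before committing to the estimate.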

### Resource Efficiency
- **CPU usage**: -40% per request (less time in PII detection)
- **Memory usage**: Comparable (within 20% of Python)
- **Scalability**: Can handle 3-5x more requests with same hardware

### Cost Savings
- **Cloud costs**: -30% for CPU-bound workloads
- **Infrastructure**: Fewer servers needed for same load
- **Carbon footprint**: -30% energy consumption

---

## 🔗 Related Issues

- #1247 - Epic: Per-Virtual-Server Plugin Selection with Multi-Level RBAC
- #1245 - Epic: Security Clearance Levels Plugin
- See: todo/rust-plugins.md for full list of 10 plugins to rustify

---

## 📚 References

### PyO3 & maturin
- [PyO3 User Guide](https://pyo3.rs/)
- [PyO3 Performance Tips](https://pyo3.rs/v0.20.0/performance.html)
- [maturin Documentation](https://www.maturin.rs/)

### Rust Performance
- [Rust Regex Performance](https://docs.rs/regex/latest/regex/#performance)
- [The Rust Performance Book](https://nnethercote.github.io/perf-book/)
- [Criterion Benchmarking](https://bheisler.github.io/criterion.rs/book/)

### Similar Success Stories
- [orjson](https://github.com/ijl/orjson) - JSON library in Rust (5-10x faster)
- [polars](https://github.com/pola-rs/polars) - DataFrame library in Rust (10-100x faster)
- [ruff](https://github.com/astral-sh/ruff) - Python linter in Rust (10-100x faster)

---

## 📝 Notes

### Why Start with PII Filter?
1. **Highest impact**: Runs on 80% of requests → biggest speedup
2. **Complex enough**: Tests PyO3 integration thoroughly (regex, recursion, masking)
3. **Template for others**: Success here enables 9 more plugins
4. **Security-critical**: Demonstrates Rust safety benefits

### Future Optimizations
- [ ] SIMD optimizations for regex matching (AVX2, NEON)
- [ ] Parallel processing with rayon for large payloads
- [ ] Custom allocator (jemalloc) for better memory performance
- [ ] Compile-time regex optimization with regex-automata

---

**Last Updated**: 2025-10-14 
**Status**: Planning Phase 
**Next Milestone**: Phase 1 - Project Setup 
**Estimated Completion**: 7 weeks 
**Priority**: High (blocks 9 other Rust plugin migrations)
