🦀 Epic: Rust-Powered PII Filter Plugin - 5-10x Performance Improvement
Goal
Rewrite the performance-critical components of the PII Filter Plugin in Rust using PyO3, achieving 5-10x speedup while maintaining 100% behavioral compatibility with the existing Python implementation. This will serve as the foundation and proof-of-concept for migrating other compute-intensive plugins to Rust.
Why Now?
The PII Filter Plugin is the highest-impact candidate for Rust optimization:
- Performance Bottleneck: Currently processes 12+ regex patterns on every string, adding ~10ms overhead per request
- High Usage: Runs on ~80% of requests via tool_pre_invoke and tool_post_invoke hooks
- Complex Processing: Deep recursive scanning of nested JSON/dict/list structures with string manipulation
- Scalability: Performance degrades linearly with payload size (1KB → 10ms, 10KB → 100ms, 100KB → 1000ms)
- Security-Critical: Must be fast and reliable for production PII detection
- Template for Others: Success here proves PyO3 integration pattern for 10+ other plugins
Expected Impact: 5-10x speedup on PII filter alone translates to 15-20% overall gateway throughput improvement since it runs on 80% of requests.
📖 User Stories
US-1: Developer - Transparent Rust Acceleration
As a Developer
I want the PII filter to automatically use Rust when available, falling back to Python when not
So that I get maximum performance without changing my code
Acceptance Criteria:
Given I have installed mcpgateway with Rust extensions
When I enable the PII filter plugin in plugins/config.yaml
Then the plugin should automatically detect and use the Rust implementation
And log: "PII Filter using Rust implementation (5-10x faster)"
And process requests with <2ms PII detection overhead
Given I have installed mcpgateway without Rust extensions (pure Python)
When I enable the PII filter plugin
Then the plugin should gracefully fall back to Python implementation
And log: "PII Filter using Python implementation (Rust not available)"
And warn: "Install mcpgateway[rust] for 5-10x better performance"
And process requests with ~10ms PII detection overhead (existing behavior)
Given I want to force Python implementation for debugging
When I set environment variable: MCPGATEWAY_FORCE_PYTHON_PLUGINS=true
Then the plugin should use Python implementation even if Rust is available
And log: "PII Filter using Python implementation (forced)"
Technical Requirements:
- Auto-detection of Rust binary at plugin initialization
- Graceful fallback with informative logging
- Environment variable override for debugging
- Zero code changes required for existing users
- Same configuration format (YAML) for both implementations
US-2: Platform Admin - Performance Monitoring
As a Platform Administrator
I want to see performance metrics comparing Rust vs Python PII filter
So that I can validate the speedup and justify the migration
Acceptance Criteria:
Given the PII filter is using Rust implementation
When I check the /metrics endpoint
Then I should see Prometheus metrics:
- pii_filter_detections_duration_seconds{implementation="rust"} (histogram)
- pii_filter_masking_duration_seconds{implementation="rust"} (histogram)
- pii_filter_detections_total{implementation="rust"} (counter)
- pii_filter_implementation{version="rust|python"} (gauge)
When I compare Rust vs Python metrics over 1000 requests
Then I should observe:
- Rust P50 latency: <1ms (vs Python: ~5ms)
- Rust P95 latency: <2ms (vs Python: ~15ms)
- Rust P99 latency: <3ms (vs Python: ~30ms)
- 5-10x throughput improvement on PII-heavy workloads
When I enable detailed profiling: MCPGATEWAY_PROFILE_PLUGINS=true
Then I should see detailed timing breakdowns:
- Pattern compilation: <0.1ms (one-time cost)
- Regex matching: <0.5ms per 1KB payload
- Masking operations: <0.3ms per 1KB payload
- JSON traversal: <0.2ms per 1KB payload
Technical Requirements:
- Prometheus metrics for both implementations
- Detailed timing instrumentation with tracing
- Performance comparison dashboard
- Memory usage tracking (Rust should use ~same or less memory)
US-3: Security Engineer - Behavioral Compatibility
As a Security Engineer
I want the Rust PII filter to produce identical results to the Python version
So that I can trust the migration without security regressions
Acceptance Criteria:
Given a test corpus of 1000+ PII detection cases covering all pattern types:
- SSNs (123-45-6789, 123456789)
- Credit cards (4111-1111-1111-1111, 4111111111111111)
- Emails (user@example.com, user+tag@subdomain.example.co.uk)
- Phone numbers (US, international, various formats)
- IP addresses (IPv4, IPv6)
- Dates of birth (MM/DD/YYYY, labeled formats)
- Passports (A123456, AB1234567)
- Bank accounts (IBAN, generic account numbers)
- Medical records (MRN: 123456)
- AWS keys (AKIAIOSFODNN7EXAMPLE)
- API keys (api_key: abc123...)
- Nested JSON structures
- Unicode edge cases
- Malformed data
When I run differential testing: Python output vs Rust output
Then the outputs must be 100% identical for:
- Detection locations (start, end positions)
- Detection types (SSN, credit_card, email, etc.)
- Detection counts
- Masked output strings
- Partial masking formats (e.g., ***-**-1234 for SSNs)
When I run property-based fuzzing with 10,000+ random inputs
Then both implementations must:
- Return same detection count
- Apply same masking strategy
- Handle edge cases identically (empty strings, null bytes, huge payloads)
- Never crash or panic
When I test with real-world payloads from production logs (sanitized)
Then Rust must detect all PII that Python detects
And Rust must not introduce false positives
And masking output must be byte-for-byte identical
Technical Requirements:
- Comprehensive differential testing suite
- Property-based testing with proptest (Rust) and hypothesis (Python)
- Fuzzing with AFL/libFuzzer for edge cases
- Real-world payload regression tests
- CI/CD gates: Rust tests must pass before merge
US-4: Developer - Easy Installation and Distribution
As a Developer
I want to install the Rust-accelerated PII filter with a single command
So that I don't have to manually compile Rust code
Acceptance Criteria:
Given I am on Linux x86_64, macOS x86_64/ARM64, or Windows x86_64
When I run: pip install mcpgateway[rust]
Then it should install pre-compiled wheels with Rust extensions
And the installation should complete in <60 seconds
And no Rust toolchain should be required
Given I am on an unsupported platform (e.g., ARM32, BSD)
When I run: pip install mcpgateway[rust]
Then it should install the pure Python version
And warn: "Pre-compiled Rust binaries not available for this platform"
And provide instructions for building from source
Given I want to build from source
When I run: pip install mcpgateway[rust] --no-binary :all:
Then it should compile Rust extensions using maturin
And require: Rust toolchain installed (rustc, cargo)
And compile in release mode with optimizations
Given I am using Docker
When I use the official mcpgateway Docker image
Then it should include pre-compiled Rust extensions
And use Rust implementation by default
And be based on Python 3.11+ slim image
Technical Requirements:
- Pre-compiled wheels for: Linux x86_64, macOS x86_64, macOS ARM64, Windows x86_64
- maturin build configuration for multiple platforms
- CI/CD pipeline for building wheels on all platforms
- PyPI distribution with platform-specific wheels
- Docker image with Rust extensions included
- Fallback to pure Python on unsupported platforms
US-5: Contributor - Maintainable Rust Codebase
As a Contributor
I want the Rust PII filter code to be well-documented and maintainable
So that I can understand and extend it easily
Acceptance Criteria:
Given I am reviewing the Rust PII filter codebase
When I read the code structure
Then I should find:
- Clear module organization (detector.rs, masking.rs, patterns.rs, lib.rs)
- Comprehensive rustdoc comments on all public functions
- Examples in doc comments showing usage
- Type annotations for all function signatures
- Error handling with Result<T, E> types
When I run: cargo doc --open
Then I should see generated documentation with:
- Module-level overview of PII detection architecture
- Function-level docs with parameter descriptions
- Examples showing how to use each function
- Links to relevant Python code for comparison
When I run: cargo clippy
Then there should be zero warnings (all clippy suggestions addressed)
When I run: cargo fmt --check
Then code should be formatted according to Rust style guide
When I add a new PII pattern
Then I should be able to:
- Add the pattern to patterns.rs with clear comments
- Write unit tests in the same file using #[cfg(test)]
- Run cargo test to verify the pattern works
- See code coverage report showing new pattern is tested
Technical Requirements:
- Rustdoc comments on all public APIs
- Examples in doc comments (tested with cargo test --doc)
- cargo clippy passing with zero warnings
- cargo fmt for consistent formatting
- Unit tests with >90% code coverage
- Integration tests for Python ↔ Rust boundary
- README.md in plugins_rust/ explaining architecture
US-6: Performance Engineer - Benchmarking and Profiling
As a Performance Engineer
I want to benchmark and profile the Rust PII filter
So that I can identify bottlenecks and optimize further
Acceptance Criteria:
Given I want to benchmark the PII filter
When I run: cargo bench
Then I should see criterion benchmark results for:
- Pattern compilation (one-time cost)
- Regex matching on 1KB, 10KB, 100KB payloads
- Masking operations (redact, partial, hash, tokenize)
- Nested JSON traversal (1 level, 3 levels, 5 levels)
- Full pipeline: detect + mask
When I compare benchmark results to Python baseline
Then I should see:
- 1KB payload: Rust 0.5ms vs Python 5ms (10x speedup)
- 10KB payload: Rust 2ms vs Python 50ms (25x speedup)
- 100KB payload: Rust 15ms vs Python 500ms (33x speedup)
When I run: cargo flamegraph --bench pii_filter
Then I should see a flamegraph showing:
- Hot paths (where time is spent)
- Regex matching dominating CPU time
- Minimal overhead from PyO3 bindings (<5%)
When I profile memory usage with valgrind/heaptrack
Then I should see:
- Memory usage comparable to Python (within 20%)
- No memory leaks over 10,000 iterations
- Efficient string handling with Cow<str> (copy-on-write)
When I optimize based on profiling data
Then I should document:
- What was optimized (e.g., used RegexSet for parallel matching)
- Measured improvement (e.g., 2x faster regex matching)
- Benchmarks before and after
Technical Requirements:
- Criterion benchmark suite in benches/pii_filter.rs
- Flamegraph integration for CPU profiling
- Memory profiling with valgrind or heaptrack
- Baseline benchmarks for Python implementation
- Documentation of optimization decisions
- Performance regression tests in CI
US-7: DevOps Engineer - CI/CD Integration
As a DevOps Engineer
I want automated CI/CD for building and testing Rust plugins
So that releases are reliable and reproducible
Acceptance Criteria:
Given a pull request modifying Rust PII filter code
When CI runs on GitHub Actions
Then it should:
- Build Rust extensions for Linux, macOS, Windows
- Run cargo test on all platforms
- Run cargo clippy and enforce zero warnings
- Run cargo fmt --check
- Run Python integration tests with Rust extensions
- Run differential tests: Rust vs Python outputs
- Generate code coverage report (codecov.io)
- Build wheels with maturin for all platforms
When all tests pass
Then the PR should be mergeable
And wheels should be uploaded as GitHub artifacts
When a release is tagged (e.g., v0.9.0)
Then CI should:
- Build release wheels for all platforms
- Run full test suite including benchmarks
- Publish wheels to PyPI (if tests pass)
- Update Docker images with new Rust binaries
- Generate release notes with performance improvements
When tests fail
Then CI should:
- Fail the build with clear error messages
- Report which platform failed (Linux/macOS/Windows)
- Show detailed test output for debugging
- Not publish wheels or Docker images
Technical Requirements:
- GitHub Actions workflow: .github/workflows/rust-plugins.yml
- Matrix build for: Linux x86_64, macOS x86_64/ARM64, Windows x86_64
- Python versions: 3.11, 3.12, 3.13
- Rust toolchain: stable channel
- maturin for building wheels
- Automated PyPI publishing on release tags
- Docker image builds with Rust extensions
- Code coverage reporting to codecov.io
US-8: Security Auditor - Secure Rust Implementation
As a Security Auditor
I want the Rust PII filter to be free of security vulnerabilities
So that it doesn't introduce new attack vectors
Acceptance Criteria:
Given the Rust PII filter codebase
When I run: cargo audit
Then there should be zero known vulnerabilities in dependencies
When I review the code for unsafe blocks
Then I should find:
- Zero unsafe blocks in core detection/masking logic
- Any unavoidable unsafe blocks must be documented with safety proofs
- PyO3 FFI is safe by design (no manual unsafe needed)
When I test with malicious inputs designed to trigger panics
Then the Rust code should:
- Never panic in production (use Result<T, E> for error handling)
- Catch panics at PyO3 boundary and return Python exceptions
- Log security events for suspicious patterns
When I test for ReDoS (Regular Expression Denial of Service)
Then the Rust regex engine should:
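- Have bounded execution time (the regex crate matches in time linear in the input and has no built-in timeout, so the 1-second limit must be enforced by the caller)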
- Not allow catastrophic backtracking
- Reject overly complex patterns that could DoS
When I test for memory safety issues
Then the code should:
- Have zero use-after-free bugs
- Have zero buffer overflows
- Have zero data races (pass cargo test with ThreadSanitizer)
When I review dependency supply chain
Then I should verify:
- All dependencies from crates.io have >1000 downloads/month
- Key dependencies (regex, serde, pyo3) have security audit history
- Dependency versions are pinned in Cargo.lock
- cargo deny is configured to block risky licenses and vulnerabilities
Technical Requirements:
- cargo audit in CI to check for vulnerabilities
- cargo deny configuration for license/vulnerability checks
- ThreadSanitizer and AddressSanitizer testing in CI
- Panic handling at PyO3 boundary
- Regex timeout enforcement (1 second default)
- Zero unsafe blocks (or heavily documented and audited)
- Security review before first release
🏗 Architecture
High-Level Design
graph TB
subgraph "Python Layer (mcpgateway/plugins/pii_filter/)"
PY[PIIFilterPlugin<br/>pii_filter.py]
PY_DETECT[Python PIIDetector<br/>Fallback Implementation]
PY_CONFIG[PIIFilterConfig<br/>Pydantic Model]
end
subgraph "Rust Core (plugins_rust/src/pii_filter/)"
RUST_LIB[lib.rs<br/>PyO3 Bindings]
RUST_DETECT[detector.rs<br/>PIIDetector Struct]
RUST_PATTERNS[patterns.rs<br/>Regex Compilation]
RUST_MASK[masking.rs<br/>Masking Strategies]
RUST_TRAVERSE[traverse.rs<br/>JSON/Dict Recursion]
end
PY -->|1. Try import| RUST_LIB
RUST_LIB -->|Success| RUST_DETECT
RUST_LIB -->|ImportError| PY_DETECT
RUST_DETECT --> RUST_PATTERNS
RUST_DETECT --> RUST_MASK
RUST_DETECT --> RUST_TRAVERSE
PY_CONFIG -->|Config| RUST_DETECT
PY_CONFIG -->|Config| PY_DETECT
Rust Module Structure
plugins_rust/
├── Cargo.toml              # Dependencies and build config
├── pyproject.toml          # maturin build config
├── README.md               # Architecture documentation
├── src/
│   ├── lib.rs              # PyO3 module definition
│   └── pii_filter/
│       ├── mod.rs          # Module exports
│       ├── detector.rs     # Core PIIDetector struct
│       ├── patterns.rs     # Regex pattern compilation
│       ├── masking.rs      # Masking strategies
│       ├── traverse.rs     # Recursive JSON/dict traversal
│       └── config.rs       # Configuration types
├── benches/
│   └── pii_filter.rs       # Criterion benchmarks
└── tests/
    └── integration.rs      # Integration tests
Data Flow: Python → Rust → Python
use pyo3::prelude::*;
use std::collections::HashMap;

// 1. Python calls Rust via PyO3
#[pyfunction]
fn detect_pii(text: &str, config: &PyAny) -> PyResult<HashMap<String, Vec<Detection>>> {
    // 2. Convert Python config to Rust struct
    let rust_config: PIIConfig = config.extract()?;
    // 3. Create detector (this compiles the patterns; reuse a PIIDetector
    //    instance across calls so the compilation cost is paid only once)
    let detector = PIIDetector::new(rust_config);
    // 4. Detect PII (fast Rust code)
    let detections = detector.detect(text);
    // 5. Return the Rust HashMap; PyO3 converts it to a Python dict
    Ok(detections)
}
Performance Optimizations
1. RegexSet for Parallel Matching
// Instead of testing each pattern sequentially:
// Python: O(N patterns × M text length)
for pattern in patterns {
    if pattern.search(text) { ... }
}
// Use RegexSet for parallel matching:
// Rust: O(M text length) - single pass!
let set = RegexSet::new(patterns)?;
// All patterns tested in one pass. The result says which patterns matched,
// not where, so only the matching regexes are run again to get spans.
let matches = set.matches(text);
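For comparison, the sequential baseline that RegexSet is meant to replace can be sketched in Python with the standard library alone. The pattern names and regexes below are simplified assumptions for illustration, not the plugin's actual patterns, and `detect_sequential` is a hypothetical helper.

```python
import re

# Hypothetical, simplified patterns; the real plugin defines its own set.
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "aws_key": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
}


def detect_sequential(text):
    """Scan the text once per pattern: O(N patterns x M text length)."""
    found = {}
    for name, pattern in PATTERNS.items():
        spans = [(m.start(), m.end()) for m in pattern.finditer(text)]
        if spans:
            found[name] = spans
    return found
```

Each pattern walks the whole input, so the cost grows with the number of patterns. The RegexSet approach first asks which patterns match at all and then runs only those to recover spans.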
2. Copy-on-Write Strings
use std::borrow::Cow;

// The explicit lifetime ties the returned Cow to `text`; with two reference
// parameters the compiler cannot elide it.
fn mask<'a>(text: &'a str, detections: &[Detection]) -> Cow<'a, str> {
    if detections.is_empty() {
        // No PII detected - return reference (zero copy)
        Cow::Borrowed(text)
    } else {
        // PII detected - allocate new string with masking
        Cow::Owned(apply_masking(text, detections))
    }
}
3. Zero-Copy JSON Traversal
use serde_json::Value;

fn traverse(value: &Value) -> Vec<Detection> {
    match value {
        Value::String(s) => detect_in_string(s),
        Value::Object(map) => {
            // Traverse without cloning
            map.values().flat_map(traverse).collect()
        }
        Value::Array(arr) => arr.iter().flat_map(traverse).collect(),
        _ => vec![],
    }
}
📋 Implementation Tasks
Phase 1: Project Setup (Week 1)
Create Rust Project Structure
- mkdir plugins_rust && cd plugins_rust
- cargo init --lib
- Configure Cargo.toml with PyO3 dependencies
- Configure pyproject.toml for maturin
- Create module structure: src/pii_filter/{mod.rs, detector.rs, patterns.rs, masking.rs, traverse.rs, config.rs}
Dependencies Configuration
[dependencies]
pyo3 = { version = "0.20", features = ["extension-module"] }
regex = "1.10"
serde = { version = "1.0", features = ["derive"] }
serde_json = "1.0"

[dev-dependencies]
criterion = "0.5"
proptest = "1.4"

[lib]
name = "plugins_rust"
# "rlib" lets benches/ and tests/ link against the crate; "cdylib" alone cannot be used by them
crate-type = ["cdylib", "rlib"]
CI/CD Pipeline Setup
- Create .github/workflows/rust-plugins.yml
- Matrix build: Linux x86_64, macOS x86_64/ARM64, Windows x86_64
- Python versions: 3.11, 3.12, 3.13
- Run cargo test, cargo clippy, cargo fmt --check
- Build wheels with maturin
- Upload artifacts to GitHub
Local Development Environment
- Install Rust: curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
- Install maturin: pip install maturin
- Test build: maturin develop --release
- Verify Python can import: python -c "import plugins_rust"
Phase 2: Core Rust Implementation (Week 2-3)
Pattern Compilation (patterns.rs)
- Define PIIType enum (SSN, CreditCard, Email, Phone, etc.)
- Define MaskingStrategy enum (Redact, Partial, Hash, Tokenize, Remove)
- Implement compile_patterns() -> (RegexSet, Vec<Regex>)
- 12 patterns: SSN, credit card, email, phone, IP, DOB, passport, driver's license, bank account, medical record, AWS key, API key
- Unit tests for each pattern with positive and negative cases
- Benchmark pattern compilation time (<1ms)
Detection Logic (detector.rs)
- PIIDetector struct with compiled patterns
- impl PIIDetector::new(config: PIIConfig) -> Self
- impl PIIDetector::detect(&self, text: &str) -> HashMap<PIIType, Vec<Detection>>
- Use RegexSet for parallel matching (single pass)
- Then use individual regexes for capture groups
- Whitelist pattern support (exclude certain matches)
- Overlap detection (don't count same span twice)
- Unit tests with comprehensive test corpus
- Benchmark detection on 1KB, 10KB, 100KB payloads
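The overlap rule in the list above can be sketched as follows. The tie-break used here (earliest start wins, then the longest span) is an assumption; the Rust port has to reproduce whatever the existing Python detector does. `resolve_overlaps` and the tuple layout are hypothetical.

```python
def resolve_overlaps(detections):
    """Keep non-overlapping detections, preferring earlier, then longer spans.

    Each detection is a (start, end, pii_type) tuple with an exclusive end.
    """
    # Sort by start ascending, then by span length descending.
    ordered = sorted(detections, key=lambda d: (d[0], -(d[1] - d[0])))
    kept = []
    last_end = -1
    for start, end, pii_type in ordered:
        if start >= last_end:  # does not overlap the previously kept span
            kept.append((start, end, pii_type))
            last_end = end
    return kept
```

With this policy a digit run matched both as an SSN and as a shorter generic account number is counted once, under the longer match.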
Masking Logic (masking.rs)
- impl mask(text: &str, detections: &HashMap<PIIType, Vec<Detection>>, strategy: MaskingStrategy) -> String
- Redact strategy: Replace with [REDACTED]
- Partial strategy: Show first/last chars (e.g., ***-**-1234 for SSN)
- Hash strategy: Replace with SHA256 hash (first 8 chars)
- Tokenize strategy: Replace with UUID token
- Remove strategy: Delete entirely
- Efficient multi-replacement using reverse iteration
- Use Cow<str> for zero-copy when no masking needed
- Unit tests for each masking strategy
- Benchmark masking operations (<0.3ms per 1KB)
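A minimal Python sketch of the masking strategies and of multi-replacement by reverse iteration is shown here. The token format, the "keep the last four characters" rule for partial masking, and the function names are assumptions for illustration; the existing Python plugin defines the reference behavior.

```python
import hashlib
import uuid


def mask_value(value, strategy):
    """Return the replacement for one detected value."""
    if strategy == "redact":
        return "[REDACTED]"
    if strategy == "partial":
        # Keep the last four characters, mask the other letters and digits.
        head, tail = value[:-4], value[-4:]
        return "".join("*" if c.isalnum() else c for c in head) + tail
    if strategy == "hash":
        return hashlib.sha256(value.encode("utf-8")).hexdigest()[:8]
    if strategy == "tokenize":
        return f"[TOKEN:{uuid.uuid4()}]"  # hypothetical token format
    if strategy == "remove":
        return ""
    raise ValueError(f"unknown masking strategy: {strategy}")


def apply_masking(text, spans, strategy):
    """Replace each (start, end) span, working from the end of the text.

    Going right to left keeps the earlier offsets valid even when a
    replacement is shorter or longer than the text it replaces.
    """
    if not spans:
        return text  # nothing to mask; return the input unchanged
    for start, end in sorted(spans, reverse=True):
        text = text[:start] + mask_value(text[start:end], strategy) + text[end:]
    return text
```

The spans are assumed to be non-overlapping, which is what the overlap detection step in the detector is for.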
JSON Traversal (traverse.rs)
- impl process_nested(data: &Value, detector: &PIIDetector) -> (bool, Value, Vec<Detection>)
- Recursive traversal of JSON structures (Object, Array, String)
- Detect JSON strings within strings (parse and traverse)
- Apply masking to detected PII
- Return (modified: bool, new_value: Value, all_detections: Vec<Detection>)
- Handle nested structures up to 10 levels deep
- Unit tests with complex nested JSON
- Benchmark traversal on realistic payloads
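The traversal contract above, returning (modified, new_value, all_detections), can be sketched in Python as follows. A single SSN regex stands in for the full detector, and leaving values below the depth limit untouched is an assumption, not a documented behavior.

```python
import json
import re

# Hypothetical single pattern standing in for the full pattern set.
SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
MAX_DEPTH = 10


def process_nested(data, depth=0):
    """Return (modified, new_value, detections) for a JSON-like value."""
    if depth > MAX_DEPTH:
        return False, data, []  # assumption: deeper levels are left untouched
    if isinstance(data, str):
        return _process_string(data, depth)
    if isinstance(data, dict):
        keys = list(data)
        parts = [process_nested(data[k], depth + 1) for k in keys]
        new_value = {k: part[1] for k, part in zip(keys, parts)}
    elif isinstance(data, list):
        parts = [process_nested(item, depth + 1) for item in data]
        new_value = [part[1] for part in parts]
    else:
        return False, data, []  # numbers, booleans and None carry no text
    detections = [d for part in parts for d in part[2]]
    return any(part[0] for part in parts), new_value, detections


def _process_string(text, depth):
    # A string may itself hold serialized JSON: parse it and recurse.
    try:
        parsed = json.loads(text)
    except ValueError:
        parsed = None
    if isinstance(parsed, (dict, list)):
        modified, new_value, found = process_nested(parsed, depth + 1)
        return modified, (json.dumps(new_value) if modified else text), found
    found = SSN.findall(text)
    return bool(found), SSN.sub("[REDACTED]", text), found
```

Re-serializing an embedded JSON string with `json.dumps` can change its whitespace, which matters for the byte-for-byte compatibility requirement; both implementations would need to agree on the serialization.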
Configuration (config.rs)
- PIIConfig struct mirroring Python PIIFilterConfig
- Serde derive for deserialization from Python
- Default values matching Python defaults
- Validation logic for config fields
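As an illustration of the validation step, here is a small Python sketch of a config object that rejects bad values before they reach the detector. The field names and defaults are hypothetical and do not reflect the real PIIFilterConfig schema.

```python
from dataclasses import dataclass, field
import re

VALID_STRATEGIES = {"redact", "partial", "hash", "tokenize", "remove"}


@dataclass
class PIIConfigSketch:
    """Hypothetical subset of the configuration; field names are assumptions."""

    detect_ssn: bool = True
    detect_email: bool = True
    default_mask_strategy: str = "redact"
    whitelist_patterns: list = field(default_factory=list)

    def __post_init__(self):
        if self.default_mask_strategy not in VALID_STRATEGIES:
            raise ValueError(f"unknown masking strategy: {self.default_mask_strategy!r}")
        for pattern in self.whitelist_patterns:
            try:
                re.compile(pattern)
            except re.error as exc:
                raise ValueError(f"invalid whitelist pattern {pattern!r}: {exc}") from exc

    @classmethod
    def from_dict(cls, raw):
        """Build from a plain dict, rejecting unknown keys."""
        unknown = set(raw) - set(cls.__dataclass_fields__)
        if unknown:
            raise ValueError(f"unknown config fields: {sorted(unknown)}")
        return cls(**raw)
```

Failing at construction time keeps invalid regexes and unknown strategies from surfacing later as errors inside the detection hot path.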
Phase 3: PyO3 Bindings (Week 3)
Python Module Definition (lib.rs)
- #[pymodule] declaration for plugins_rust
- Export PIIDetector as Python class
- Export detect_pii, mask_pii, process_nested as functions
PIIDetector Python Class
#[pyclass]
struct PIIDetector {
    inner: detector::PIIDetector,
}

#[pymethods]
impl PIIDetector {
    #[new]
    fn new(config: &PyAny) -> PyResult<Self> {
        // Extract Python config to Rust struct
        let rust_config: PIIConfig = config.extract()?;
        Ok(Self {
            inner: detector::PIIDetector::new(rust_config),
        })
    }

    fn detect(&self, text: &str) -> PyResult<HashMap<String, Vec<PyDetection>>> {
        // Call Rust detector
        let detections = self.inner.detect(text);
        // Convert to Python types
        Ok(convert_to_python(detections))
    }

    fn mask(&self, text: &str, detections: &PyAny) -> PyResult<String> {
        // Convert Python detections to Rust
        let rust_detections = convert_from_python(detections)?;
        // Call Rust masking
        Ok(self.inner.mask(text, &rust_detections))
    }
}
Type Conversions
- Python dict → Rust HashMap
- Python list → Rust Vec
- Python str → Rust &str
- Rust Result → Python exceptions
- Handle None/null values
Error Handling
- Catch Rust panics at boundary
- Convert to Python exceptions
- Preserve error messages and context
Phase 4: Python Integration (Week 4)
Modify Python Plugin (plugins/pii_filter/pii_filter.py)
- Try importing Rust module at top of file
try:
    from plugins_rust import PIIDetector as RustPIIDetector
    USE_RUST = True
    logger.info("PII Filter using Rust implementation (5-10x faster)")
except ImportError as e:
    USE_RUST = False
    logger.warning(f"PII Filter using Python implementation: {e}")
    logger.warning("Install mcpgateway[rust] for 5-10x better performance")
Detector Selection Logic
def __init__(self, config: PluginConfig):
    super().__init__(config)
    self.pii_config = PIIFilterConfig.model_validate(self._config.config)
    # Check environment variable override
    force_python = os.getenv("MCPGATEWAY_FORCE_PYTHON_PLUGINS", "false").lower() == "true"
    if USE_RUST and not force_python:
        self.detector = RustPIIDetector(self.pii_config.model_dump())
        self.implementation = "rust"
    else:
        self.detector = PIIDetector(self.pii_config)  # Python implementation
        self.implementation = "python"
    logger.info(f"PII Filter initialized with {self.implementation} implementation")
Metrics Integration
- Add implementation label to Prometheus metrics
- Track detection duration by implementation
- Track masking duration by implementation
- Track detection counts by implementation
Backward Compatibility
- Ensure same API for both Rust and Python detectors
- Same return types (dicts, lists)
- Same exception types
- Same configuration format
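One way to make the "same API" requirement checkable is a structural protocol that both detectors must satisfy. The sketch below is an assumption about the shared surface; the method names follow the PyO3 class above, while the return types and helper names are hypothetical.

```python
from typing import Dict, List, Protocol, runtime_checkable


@runtime_checkable
class DetectorAPI(Protocol):
    """Hypothetical shared surface; the real method signatures may differ."""

    def detect(self, text: str) -> Dict[str, List[dict]]: ...

    def mask(self, text: str, detections: Dict[str, List[dict]]) -> str: ...


class FakePythonDetector:
    """Stand-in used only to exercise the check."""

    def detect(self, text):
        return {}

    def mask(self, text, detections):
        return text


def ensure_detector(obj):
    """Fail fast at plugin start-up if a detector lacks a required method."""
    if not isinstance(obj, DetectorAPI):
        raise TypeError(f"{type(obj).__name__} does not implement detect() and mask()")
    return obj
```

A runtime protocol check only verifies that the methods exist; agreement on return values still has to come from the differential tests.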
Phase 5: Testing (Week 4-5)
Unit Tests (Rust)
- tests/ directory with integration tests
- Test each pattern individually (50+ test cases)
- Test masking strategies (20+ test cases)
- Test JSON traversal (30+ test cases)
- Test edge cases: empty strings, null bytes, Unicode
- Run with: cargo test
Integration Tests (Python ↔ Rust)
- tests/unit/mcpgateway/plugins/test_pii_filter_rust.py
- Test Python can import Rust module
- Test config conversion (Python → Rust)
- Test detection results (Rust → Python)
- Test error handling (Rust exceptions → Python)
Differential Testing
- tests/differential/test_pii_filter_differential.py
- 1000+ test cases covering all pattern types
- Run same input through both Python and Rust
- Assert outputs are identical (byte-for-byte)
- Test corpus: SSNs, credit cards, emails, phones, etc.
- Test nested JSON structures
- Test real-world payloads from production (sanitized)
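A differential harness of the kind described above can be sketched in a few lines. Both functions below are stand-ins written for this example: `reference_mask` plays the role of the existing Python implementation and `candidate_mask` the role of the Rust implementation reached through PyO3.

```python
import re

SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")  # hypothetical single pattern


def reference_mask(text):
    """Stand-in for the existing Python implementation."""
    return SSN.sub("[REDACTED]", text)


def candidate_mask(text):
    """Stand-in for the Rust implementation exposed through PyO3."""
    out, pos = [], 0
    for m in SSN.finditer(text):
        out.append(text[pos:m.start()])
        out.append("[REDACTED]")
        pos = m.end()
    out.append(text[pos:])
    return "".join(out)


def differential_failures(corpus, reference, candidate):
    """Return (input, expected, actual) for every input where outputs differ."""
    failures = []
    for text in corpus:
        expected, actual = reference(text), candidate(text)
        if expected != actual:
            failures.append((text, expected, actual))
    return failures
```

Returning the full list of disagreements, rather than stopping at the first one, makes it easier to see whether a mismatch is isolated or systematic.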
Property-Based Testing
- Use proptest (Rust) to generate random inputs
- Test invariants: detection count ≥ 0, no crashes, etc.
- Use hypothesis (Python) to generate test cases
- Run 10,000+ iterations with random data
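The invariants can be illustrated with a seeded random generator from the standard library. This is a stand-in for hypothesis or proptest, which add shrinking and smarter input generation; the pattern, alphabet, and function names are assumptions.

```python
import random
import re
import string

SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")  # hypothetical single pattern


def mask(text):
    """Stand-in for the masking function under test."""
    return SSN.sub("[REDACTED]", text)


def random_text(rng, max_len=80):
    """Random text with awkward characters, sometimes containing an SSN."""
    alphabet = string.ascii_letters + string.digits + " -\n\x00\u00e9\u6f22"
    text = "".join(rng.choice(alphabet) for _ in range(rng.randint(0, max_len)))
    if rng.random() < 0.5:  # plant a well-formed SSN half of the time
        ssn = "%03d-%02d-%04d" % (rng.randint(0, 999), rng.randint(0, 99), rng.randint(0, 9999))
        cut = rng.randint(0, len(text))
        text = text[:cut] + " " + ssn + " " + text[cut:]
    return text


def check_invariants(iterations=10_000, seed=0):
    """Run the invariants on random inputs; raise AssertionError on failure."""
    rng = random.Random(seed)
    for _ in range(iterations):
        text = random_text(rng)
        masked = mask(text)
        assert SSN.search(masked) is None  # nothing detectable survives
        assert mask(masked) == masked      # masking is idempotent
        if SSN.search(text) is None:
            assert masked == text          # clean input is untouched
    return iterations
```

A fixed seed keeps any failure reproducible in CI.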
Fuzzing
- Use cargo-fuzz or AFL for Rust fuzzing
- Fuzz detect() function with random strings
- Fuzz mask() function with random detections
- Fuzz JSON traversal with malformed JSON
- Ensure no panics or crashes after 1M+ iterations
Performance Testing
- tests/performance/test_pii_filter_benchmark.py
- Benchmark Python vs Rust with 1KB, 10KB, 100KB payloads
- Measure P50, P95, P99 latencies
- Assert Rust is 5-10x faster than Python
- Generate performance comparison report
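The latency measurements listed above could be collected along these lines. The helpers are hypothetical, and `python_baseline` is a one-pattern stand-in rather than the real detector.

```python
import re
import statistics
import time

SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")  # hypothetical single pattern


def python_baseline(payload):
    """Stand-in for the function being benchmarked."""
    return SSN.sub("[REDACTED]", payload)


def measure_latencies(func, payload, runs=200):
    """Return per-call latencies in milliseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        func(payload)
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples


def percentiles(samples):
    """Return the P50, P95 and P99 of the samples."""
    cuts = statistics.quantiles(samples, n=100, method="inclusive")
    return {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]}


def speedup(baseline_samples, candidate_samples):
    """Ratio of median latencies; values above 1 mean the candidate is faster."""
    return statistics.median(baseline_samples) / statistics.median(candidate_samples)
```

For a roughly 1KB payload, `percentiles(measure_latencies(python_baseline, "user 123-45-6789 " * 64))` gives the three figures to compare against the targets; the speedup assertion would compare the Python and Rust sample lists on the same payload.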
Phase 6: Benchmarking (Week 5)
Criterion Benchmarks (Rust)
- benches/pii_filter.rs
- Benchmark pattern compilation (one-time cost)
- Benchmark detection on various payload sizes
- Benchmark masking operations
- Benchmark JSON traversal
- Run with: cargo bench
- Generate HTML reports in target/criterion/
Python Baseline Benchmarks
- benchmarks/baseline_python.py
- Measure Python implementation performance
- Same test cases as Rust benchmarks
- Generate comparison charts
Flamegraph Profiling
- Install cargo-flamegraph: cargo install flamegraph
- Generate flamegraph: cargo flamegraph --bench pii_filter
- Identify hot paths (should be regex matching)
- Verify PyO3 overhead is <5%
Memory Profiling
- Use valgrind or heaptrack
- Measure memory usage: Rust vs Python
- Check for memory leaks over 10,000 iterations
- Verify memory usage is comparable (within 20%)
Phase 7: Documentation (Week 6)
Rust API Documentation
- Rustdoc comments on all public functions
- Module-level overview in src/pii_filter/mod.rs
- Examples in doc comments
- Generate docs: cargo doc --open
- Ensure 100% documentation coverage
Python Integration Guide
- Update plugins/pii_filter/README.md
- Document Rust implementation benefits
- Installation instructions: pip install mcpgateway[rust]
- Configuration examples (same format for both)
- Troubleshooting: fallback to Python, forced Python mode
Performance Comparison
- Create docs/rust-plugins-performance.md
- Benchmark results: Rust vs Python
- Charts showing speedup across payload sizes
- Memory usage comparison
- When to use Rust vs Python
Contributing Guide
- Update CONTRIBUTING.md for Rust plugins
- How to set up Rust development environment
- How to build and test locally
- How to run benchmarks
- Code style: cargo fmt, cargo clippy
Phase 8: Distribution (Week 6-7)
Build Wheels for All Platforms
- Linux x86_64: maturin build --release --target x86_64-unknown-linux-gnu
- macOS x86_64: maturin build --release --target x86_64-apple-darwin
- macOS ARM64: maturin build --release --target aarch64-apple-darwin
- Windows x86_64: maturin build --release --target x86_64-pc-windows-msvc
- Test wheels on each platform
PyPI Publishing
- Configure maturin for PyPI
- Test upload to TestPyPI first
- Publish to PyPI: maturin publish
- Verify installation: pip install mcpgateway[rust]
Docker Images
- Update Dockerfile to include Rust toolchain
- Build and include Rust extensions in image
- Test that Rust implementation is used by default
- Publish to Docker Hub / GitHub Container Registry
Fallback Handling
- Test on unsupported platform (e.g., ARM32)
- Verify graceful fallback to Python
- Verify informative warning message
- Document supported platforms in README
Phase 9: Monitoring & Observability (Week 7)
Prometheus Metrics
- pii_filter_detections_duration_seconds{implementation="rust|python"}
- pii_filter_masking_duration_seconds{implementation="rust|python"}
- pii_filter_detections_total{implementation="rust|python"}
- pii_filter_implementation{version="rust|python"} (gauge)
- pii_filter_pattern_matches_total{pattern="ssn|email|..."} (counter)
Structured Logging
- Log implementation choice at startup
- Log performance metrics (P50, P95, P99)
- Log detection counts by type
- Log errors and warnings with context
Tracing Integration
- Add OpenTelemetry spans for Rust operations
- Trace detect(), mask(), process_nested()
- Include payload size and detection count in spans
Grafana Dashboard
- Create dashboard for PII filter metrics
- Compare Rust vs Python performance
- Show detection rates by pattern type
- Alert on performance degradation
Phase 10: Security & Audit (Week 7)
Dependency Audit
- Run cargo audit to check for vulnerabilities
- Configure cargo-deny to block risky dependencies
- Review dependency supply chain (downloads, maintainers)
- Pin dependency versions in Cargo.lock
Code Review
- Review for unsafe blocks (should be zero)
- Review for panics (should use Result<T, E>)
- Review error handling at PyO3 boundary
- Review regex patterns for ReDoS vulnerabilities
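Sanitizer Testing
- Run with AddressSanitizer (requires a nightly toolchain): RUSTFLAGS="-Zsanitizer=address" cargo +nightly test -Zbuild-std --target x86_64-unknown-linux-gnu
- Run with ThreadSanitizer: RUSTFLAGS="-Zsanitizer=thread" cargo +nightly test -Zbuild-std --target x86_64-unknown-linux-gnu
- Run with MemorySanitizer: RUSTFLAGS="-Zsanitizer=memory" cargo +nightly test -Zbuild-std --target x86_64-unknown-linux-gnu
- Fix any reported issues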
Security Documentation
- Document security considerations
- Threat model for PII detection
- ReDoS mitigation (regex timeout enforcement)
- Memory safety guarantees
- Panic handling strategy
✅ Success Criteria
Performance Targets
- 5-10x speedup over Python implementation
- 1KB payload: Rust <1ms vs Python ~5ms (5x)
- 10KB payload: Rust <2ms vs Python ~50ms (25x)
- 100KB payload: Rust <15ms vs Python ~500ms (33x)
- P95 latency: <2ms for 1KB payloads
- P99 latency: <3ms for 1KB payloads
- Throughput: 500+ requests/sec per core (vs 50-100 for Python)
Compatibility Targets
- 100% behavioral compatibility with Python implementation
- Zero regressions in detection accuracy
- Identical outputs in differential testing (1000+ test cases)
- Graceful fallback to Python when Rust unavailable
Quality Targets
- Code coverage: >90% for Rust code (measured with tarpaulin)
- Zero warnings: cargo clippy passes
- Zero crashes: Fuzzing with 1M+ iterations without panics
- Zero security issues: cargo audit passes
Distribution Targets
- Pre-compiled wheels for Linux x86_64, macOS x86_64/ARM64, Windows x86_64
- PyPI published: pip install mcpgateway[rust] works
- Docker images include Rust extensions
- CI/CD passing on all platforms
🏁 Definition of Done
- Rust PII filter implemented with all 12+ patterns
- PyO3 bindings expose PIIDetector to Python
- Python plugin auto-detects and uses Rust when available
- Graceful fallback to Python with informative logging
- Differential testing shows 100% output compatibility
- Benchmarks show 5-10x speedup over Python
- All unit tests passing (Rust and Python)
- All integration tests passing
- Property-based testing and fuzzing completed
- Documentation complete (Rustdoc, README, performance guide)
- Pre-compiled wheels built for all platforms
- Published to PyPI
- Docker images updated with Rust extensions
- CI/CD pipeline passing on all platforms
- Prometheus metrics integrated
- Security audit completed (cargo audit, sanitizers)
- Code review by 2+ maintainers
- Performance validated in staging environment
- Ready for production deployment
📈 Expected Impact
Performance Improvements
- Individual PII filter: 5-10x faster
- Overall gateway throughput: +15-20% (since PII filter runs on 80% of requests)
- P95 latency reduction: 50ms → 38ms (saving 12ms)
- P99 latency reduction: 100ms → 76ms (saving 24ms)
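The relationship between the component speedup and the overall gain follows Amdahl's law. The source does not state what fraction of request time the PII filter occupies, so the sketch below solves for the fraction implied by the stated targets; both function names are hypothetical.

```python
def overall_speedup(fraction, component_speedup):
    """Amdahl's law: speedup of the whole when one part gets faster.

    fraction is the share of total time spent in the accelerated part.
    """
    return 1.0 / ((1.0 - fraction) + fraction / component_speedup)


def implied_fraction(target_overall, component_speedup):
    """Share of time the component must occupy to reach target_overall."""
    return (1.0 - 1.0 / target_overall) / (1.0 - 1.0 / component_speedup)
```

Under this model, `implied_fraction(1.15, 5)` is about 0.16 and `implied_fraction(1.20, 10)` is about 0.19, so the 15-20% throughput target corresponds to the filter taking roughly 16-19% of average request time. That fraction should be measured before the target is relied on.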
Resource Efficiency
- CPU usage: -40% per request (less time in PII detection)
- Memory usage: Comparable (within 20% of Python)
- Scalability: Can handle 3-5x more requests with same hardware
Cost Savings
- Cloud costs: -30% for CPU-bound workloads
- Infrastructure: Fewer servers needed for same load
- Carbon footprint: -30% energy consumption
🔗 Related Issues
- #1247 - Epic: Per-Virtual-Server Plugin Selection with Multi-Level RBAC
- #1245 - Epic: Security Clearance Levels Plugin - Bell-LaPadula MAC Implementation
- See: todo/rust-plugins.md for full list of 10 plugins to rustify
📚 References
Similar Success Stories
- orjson - JSON library in Rust (5-10x faster)
- polars - DataFrame library in Rust (10-100x faster)
- ruff - Python linter in Rust (10-100x faster)
📝 Notes
Why Start with PII Filter?
- Highest impact: Runs on 80% of requests → biggest speedup
- Complex enough: Tests PyO3 integration thoroughly (regex, recursion, masking)
- Template for others: Success here enables 9 more plugins
- Security-critical: Demonstrates Rust safety benefits
Future Optimizations
- SIMD optimizations for regex matching (AVX2, NEON)
- Parallel processing with rayon for large payloads
- Custom allocator (jemalloc) for better memory performance
- Compile-time regex optimization with regex-automata
Last Updated: 2025-10-14
Status: Planning Phase
Next Milestone: Phase 1 - Project Setup
Estimated Completion: 7 weeks
Priority: High (blocks 9 other Rust plugin migrations)