feat/add-redact-feature-on-regex-plugin #1292

Open
wants to merge 2 commits into
base: main
from

Conversation

siddharthsambharia-portkey
Contributor

Description

This PR enhances the Regex Match plugin by adding an optional redaction feature.
Developers can now configure the plugin to not only detect regex matches but also redact matching text with a replacement string (default: [REDACTED]).
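A minimal sketch of the behavior described above, not the plugin's actual code. Only `redact`, `redactText`, and the `[REDACTED]` default come from this PR; the `RedactParams` shape, the `rule` field name, and the `applyRegex` helper are hypothetical names used for illustration.

```typescript
// Hypothetical parameter shape: only `redact` and `redactText` are named in the PR.
interface RedactParams {
  rule: string; // regex source (assumed field name)
  redact?: boolean; // when true, replace matches instead of only detecting them
  redactText?: string; // replacement string, defaults to '[REDACTED]'
}

function applyRegex(
  text: string,
  params: RedactParams
): { matched: boolean; output: string } {
  const redactText = params.redactText ?? '[REDACTED]';
  // Global flag so that every occurrence is replaced, not just the first.
  const regex = new RegExp(params.rule, 'g');
  const matched = regex.test(text);
  if (!matched || !params.redact) {
    // Detection-only mode: the text is returned unchanged.
    return { matched, output: text };
  }
  // test() advanced lastIndex; reset it before reusing the regex.
  regex.lastIndex = 0;
  // A replacer function inserts redactText literally, so sequences such as
  // "$&" are not interpreted. This is a choice made for this sketch.
  return { matched, output: text.replace(regex, () => redactText) };
}
```

With `redact` omitted the helper only reports whether the pattern matched, which mirrors the backward-compatible detection mode the description mentions.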

Motivation

  • Current implementation only checks for regex matches and blocks/flags events.
  • Many use cases (PII filtering, sensitive data handling, compliance workflows) require redacting text instead of blocking requests.
  • This update aligns the Regex plugin with the PII plugin’s capabilities while maintaining backward compatibility.

Type of Change

  • New feature (non-breaking change which adds functionality)

How Has This Been Tested?

  • Unit tests validating detection + redaction behavior
  • Manual tests on request and response flows with sample regex patterns
  • Verified backward compatibility with existing regex match usage

Screenshots (if applicable)

No UI impact – plugin configuration extended with redact and redactText parameters.

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation (plugin manifest updated)
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective
  • New and existing unit tests pass locally with my changes

Related Issues

Fixes and closes #1291

matter-code-review bot commented Aug 19, 2025

Code Quality security vulnerability new feature

Description

Summary By MatterAI

🔄 What Changed

The Regex Match plugin (regexMatch.ts) has been enhanced with a new redact feature. This allows the plugin to not only detect text matching a specified regex pattern but also to replace it with a configurable redactText (defaulting to [REDACTED]). The implementation optimizes regex handling by creating the RegExp object once and correctly resetting its lastIndex property for global matches, improving performance and ensuring accurate redaction across multiple text parts. Additionally, null/undefined checks for text excerpts have been improved.
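The `lastIndex` reset mentioned above matters because `exec()` and `test()` on a regex with the `g` flag resume searching from `lastIndex`. The following is a small standalone illustration of that JavaScript behavior; `execFromStart` is a hypothetical helper, not a function from the plugin.

```typescript
// Hypothetical helper: always search from the beginning of the text.
function execFromStart(regex: RegExp, text: string): RegExpExecArray | null {
  regex.lastIndex = 0;
  return regex.exec(text);
}

const shared = /\d+/g;

// First call matches '42' and leaves lastIndex just past the match.
const firstMatch = shared.exec('id 42');
const lastIndexAfterFirst = shared.lastIndex;

// Reusing the same regex on a shorter string starts at the stale lastIndex,
// which is beyond the end of the string, so the digit is missed.
const staleMatch = shared.exec('id 7');

// Resetting lastIndex before each exec() gives the expected result.
execFromStart(shared, 'id 42');
const freshMatch = execFromStart(shared, 'id 7');
```

`String.prototype.match` and `String.prototype.replace` restart from index 0 on their own when given a global regex, so the explicit reset is mainly needed around `exec()` and `test()`.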

🔍 Impact of the Change

This feature significantly expands the plugin's utility, enabling robust data privacy and content modification capabilities. It allows for the masking of sensitive information, transforming the plugin into a more powerful tool for data governance. The performance optimization for regex handling ensures efficient processing, especially with large inputs.

📁 Total Files Changed

  • plugins/default/regexMatch.ts: Implemented regex redaction logic, optimized regex object creation and reuse, and enhanced null/undefined checks for text handling.

🧪 Test Added

N/A

🔒Security Vulnerabilities

The regexPattern parameter is directly used to create a RegExp object. This introduces a potential Regular Expression Denial of Service (ReDoS) vulnerability if a malicious or inefficient regex pattern is provided by the user, which could lead to excessive processing time and resource consumption.

Motivation

To enhance data privacy and allow for masking of sensitive information identified by regex patterns.

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update
  • Refactoring (no functional changes)

How Has This Been Tested?

  • Unit Tests
  • Integration Tests
  • Manual Testing

Screenshots (if applicable)

N/A

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

Related Issues

N/A

Tip

Quality Recommendations

  1. Implement robust input validation for the regexPattern parameter to prevent Regular Expression Denial of Service (ReDoS) attacks. Consider using a regex validation library or setting a timeout for regex execution.

  2. Add comprehensive unit tests specifically for the new redaction feature, covering various regex patterns, redactText values, and edge cases (e.g., no matches, multiple matches, empty input text).
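One possible starting point for the first recommendation is sketched below. It only rejects empty, overlong, or syntactically invalid patterns before they reach the matching loop; it does not detect catastrophic backtracking, which would still need a dedicated checker or an execution timeout. `compilePattern`, `CompileResult`, and the `MAX_PATTERN_LENGTH` value are assumptions made for this example.

```typescript
// Assumed limit for illustration; a real limit would depend on the deployment.
const MAX_PATTERN_LENGTH = 500;

type CompileResult =
  | { ok: true; regex: RegExp }
  | { ok: false; reason: string };

// Hypothetical guard: validates the pattern before it is used for matching.
function compilePattern(source: string, flags = ''): CompileResult {
  if (source.length === 0) {
    return { ok: false, reason: 'Missing regex pattern' };
  }
  if (source.length > MAX_PATTERN_LENGTH) {
    return { ok: false, reason: 'Regex pattern too long' };
  }
  try {
    // new RegExp throws a SyntaxError for malformed patterns.
    return { ok: true, regex: new RegExp(source, flags) };
  } catch (e) {
    return { ok: false, reason: e instanceof Error ? e.message : String(e) };
  }
}
```

Returning a result object instead of throwing keeps the caller free to decide whether an invalid pattern should fail the check or be reported as a configuration error.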

Tanka Poem ♫

Regex now can hide,
Sensitive data concealed,
Privacy takes flight.
Patterns match, then transform,
Science shields, with careful hand. 🔬✨

Sequence Diagram

sequenceDiagram
    participant PH as PluginHandler
    participant C as Context
    participant ET as EventType
    participant T as Text

    PH->>C: getCurrentContentPart(context, eventType)
    C-->>PH: {content, textArray}

    alt Missing regexPattern or text
        PH-->>PH: throw Error('Missing regex pattern')
    else
        PH->>PH: Create RegExp(regexPattern, redact ? 'g' : '')
        loop For each text in textArray
            PH->>PH: Reset regex.lastIndex = 0 (if redact)
            PH->>T: text.match(regex)
            T-->>PH: matches
            alt Matches found AND redact is true
                PH->>PH: Reset regex.lastIndex = 0
                PH->>T: text.replace(regex, redactText)
                T-->>PH: redactedText
                PH->>PH: Add redactedText to mappedTextArray
            else No matches or no redaction
                PH->>PH: Add null to mappedTextArray
            end
        end

        PH->>C: getText(context, eventType)
        C-->>PH: textToMatch

        alt Redact is true
            PH->>PH: Reset regex.lastIndex = 0
        end
        PH->>T: regex.exec(textToMatch)
        T-->>PH: singleMatch

        PH-->>PH: Prepare data object (regexPattern, hasMatches, singleMatch, textExcerpt)
    end

    alt Error Occurred
        PH-->>PH: Catch error (e)
        PH->>C: getText(context, eventType)
        C-->>PH: textExcerpt
        PH-->>PH: Prepare error data object (explanation, regexPattern, not, textExcerpt)
    end
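The per-part loop in the diagram can be sketched roughly as follows, assuming redaction is enabled. `redactParts` is a hypothetical standalone function; the real plugin obtains `textArray` from `getCurrentContentPart`, which is not reproduced here.

```typescript
// Hypothetical standalone version of the loop over textArray.
function redactParts(
  textArray: Array<string | null | undefined>,
  regexPattern: string,
  redactText = '[REDACTED]'
): { hasMatches: boolean; mappedTextArray: Array<string | null> } {
  // Created once and reused for every part.
  const regex = new RegExp(regexPattern, 'g');
  let hasMatches = false;
  const mappedTextArray: Array<string | null> = [];

  for (const text of textArray) {
    if (!text) {
      mappedTextArray.push(null); // nothing to inspect in this part
      continue;
    }
    // match() with a global regex searches from index 0 on each call.
    if (text.match(regex) === null) {
      mappedTextArray.push(null); // null marks "leave this part unchanged"
      continue;
    }
    hasMatches = true;
    mappedTextArray.push(text.replace(regex, redactText));
  }

  return { hasMatches, mappedTextArray };
}
```

Using `null` for untouched parts follows the diagram's "Add null to mappedTextArray" branch, so a caller can tell which parts were rewritten.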

@matter-code-review matter-code-review bot left a comment

Added redaction feature to regex plugin with proper text processing and backward compatibility. Found several logic and performance issues that need attention.

Skipped files
  • plugins/default/manifest.json: Skipped file pattern

Comment on lines 47 to 61
const regex = new RegExp(regexPattern, redact ? 'g' : '');

// Check for matches across all text
let hasMatches = false;
const mappedTextArray: Array<string | null> = [];

textArray.forEach((text) => {
if (!text) {
mappedTextArray.push(null);
return;
}

// Check if pattern exists in text
const localRegex = new RegExp(regexPattern, 'g');
const matches = text.match(localRegex);

🐛 Bug Fix

Issue: Creating redundant RegExp objects - regex is created but never used, while localRegex is recreated in every loop iteration
Fix: Reuse the main regex object and avoid unnecessary object creation
Impact: Reduces memory allocation and improves performance in loops

Suggested change
  const regex = new RegExp(regexPattern, redact ? 'g' : '');
  // Check for matches across all text
  let hasMatches = false;
  const mappedTextArray: Array<string | null> = [];
  textArray.forEach((text) => {
    if (!text) {
      mappedTextArray.push(null);
      return;
    }
    // Check if pattern exists in text
-   const localRegex = new RegExp(regexPattern, 'g');
-   const matches = text.match(localRegex);
+   const matches = text.match(regex);

Comment on lines 66 to 68
if (redact) {
const redactedText = text.replace(localRegex, redactText);
mappedTextArray.push(redactedText);

🐛 Bug Fix

Issue: Using localRegex variable that no longer exists after the previous fix
Fix: Use the main regex object for replacement
Impact: Fixes reference error and maintains consistent regex behavior

Suggested change
  if (redact) {
-   const redactedText = text.replace(localRegex, redactText);
+   const redactedText = text.replace(regex, redactText);
    mappedTextArray.push(redactedText);
} else {

transformed = true;
}

verdict = not ? hasMatches : !shouldBlock;

🐛 Bug Fix

Issue: Logic error in verdict calculation - when not=true and hasMatches=true, the verdict should be false (blocked), but the expression takes the hasMatches branch and returns true
Fix: Correct the boolean logic for the not parameter
Impact: Fixes inverted logic that would allow matches when they should be blocked

Suggested change
- verdict = not ? hasMatches : !shouldBlock;
+ verdict = not ? !hasMatches : !shouldBlock;
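To make the difference between the two expressions concrete, the sketch below evaluates both for the `not = true` cases. `verdictBefore` and `verdictSuggested` are hypothetical wrappers around the two lines above, and `shouldBlock` is passed in as a plain boolean because its computation is not shown in this review.

```typescript
// Expression as written in the PR.
function verdictBefore(not: boolean, hasMatches: boolean, shouldBlock: boolean): boolean {
  return not ? hasMatches : !shouldBlock;
}

// Expression from the suggested change.
function verdictSuggested(not: boolean, hasMatches: boolean, shouldBlock: boolean): boolean {
  return not ? !hasMatches : !shouldBlock;
}

// The two versions differ only when `not` is true; the `not = false` branch is identical.
const rows = [true, false].map((hasMatches) => ({
  hasMatches,
  before: verdictBefore(true, hasMatches, false),
  suggested: verdictSuggested(true, hasMatches, false),
}));
```

For `not = true` the original expression passes when a match is found, while the suggested one passes only when no match is found.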

Comment on lines 92 to 94
// For backward compatibility, also get single text for matchDetails
const textToMatch = getText(context, eventType);
const singleMatch = new RegExp(regexPattern).exec(textToMatch);

⚡️ Performance Improvement

Issue: Creating new RegExp object for single match when regex already exists
Fix: Reuse existing regex object with reset lastIndex for global patterns
Impact: Eliminates unnecessary RegExp construction

Suggested change
  // For backward compatibility, also get single text for matchDetails
  const textToMatch = getText(context, eventType);
- const singleMatch = new RegExp(regexPattern).exec(textToMatch);
+ regex.lastIndex = 0; // Reset for global regex
+ const singleMatch = regex.exec(textToMatch);

Comment on lines 117 to 120
  textExcerpt:
-   textToMatch.length > 100
+   textToMatch?.length > 100
      ? textToMatch.slice(0, 100) + '...'
      : textToMatch,

🐛 Bug Fix

Issue: Potential null pointer exception when textToMatch is null/undefined
Fix: Add null check before accessing length property
Impact: Prevents runtime crash when text content is missing

Suggested change
  textExcerpt:
-   textToMatch?.length > 100
+   textToMatch && textToMatch.length > 100
      ? textToMatch.slice(0, 100) + '...'
      : textToMatch,
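The excerpt logic can be isolated into a small helper, sketched here under the assumption that the text may be `null` or `undefined`. `excerpt` is a hypothetical name. Both the `?.` form and the explicit check avoid reading `length` from a missing value; the explicit check additionally keeps the comparison between two numbers.

```typescript
// Hypothetical helper: truncate long text for logging, pass missing text through.
function excerpt(
  text: string | null | undefined,
  max = 100
): string | null | undefined {
  // The truthiness check runs first, so length is only read from a real string.
  return text && text.length > max ? text.slice(0, max) + '...' : text;
}
```

Text at or below the limit, and missing text, are returned unchanged; only longer strings are cut and suffixed with `...`.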

Important

PR Review Skipped

PR review skipped as per the configuration setting. Run a manual review by commenting /matter review

💡Tips to use Matter AI

Command List

  • /matter summary: Generate AI Summary for the PR
  • /matter review: Generate AI Reviews for the latest commit in the PR
  • /matter review-full: Generate AI Reviews for the complete PR
  • /matter release-notes: Generate AI release-notes for the PR
  • /matter : Chat with your PR with Matter AI Agent
  • /matter remember : Generate AI memories for the PR
  • /matter explain: Get an explanation of the PR
  • /matter help: Show the list of available commands and documentation
  • Need help? Join our Discord server: https://discord.gg/fJU5DvanU3

Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Feature] add redact functionality to the regex plugin
1 participant