17 changes: 15 additions & 2 deletions src/plugins/universal-guard/pii-scrubber.js
@@ -111,7 +111,7 @@ export class PiiScrubber {
* @param {boolean} cfg.logDetections - Whether to log PII detections
*/
constructor(cfg) {
this._cfg = cfg;
this._cfg = this._normalizeConfig(cfg);
this._detections = 0;
this._byType = {};
}
Comment on lines 113 to 117

P2 | Confidence: High

The constructor now delegates to _normalizeConfig, which centralizes validation logic. One behavior is worth noting: the _detections and _byType counters are reset to zero in the constructor but not when updateConfig is called. This is correct, because the detection statistics describe the scrubber's runtime activity, not its configuration, and the PR does not change it. It does mean that re-instantiating the scrubber with a new config (instead of calling updateConfig) resets the stats: updateConfig is for dynamic reconfiguration, while a new instance gives a fresh state. The architecture is sound.
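
To make the distinction concrete, here is a minimal sketch. MiniScrubber and recordDetection are hypothetical stand-ins, not the real PiiScrubber API; only the counter and config handling described above are modeled.

```javascript
// Hypothetical stand-in for PiiScrubber: only counters and config are modeled.
class MiniScrubber {
    constructor(cfg) {
        this._cfg = cfg;
        this._detections = 0; // runtime statistic, reset only on construction
    }

    updateConfig(cfg) {
        this._cfg = cfg; // configuration changes, statistics are left untouched
    }

    recordDetection() {
        this._detections += 1;
    }
}

const scrubber = new MiniScrubber({ enabled: true });
scrubber.recordDetection();
scrubber.updateConfig({ enabled: false });
const afterUpdate = scrubber._detections; // counter survives reconfiguration

const fresh = new MiniScrubber({ enabled: false });
const afterNew = fresh._detections; // a new instance starts from zero
```

Here afterUpdate is 1 and afterNew is 0, which is the behavior the comment describes as intended.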

@@ -184,6 +184,19 @@ export class PiiScrubber {
}

updateConfig(cfg) {
this._cfg = cfg;
this._cfg = this._normalizeConfig(cfg);
}

_normalizeConfig(cfg) {

P2 | Confidence: High

The new _normalizeConfig method introduces a defensive pattern for configuration handling, which is architecturally sound. However, it risks a subtle behavioral shift: if the returned object is built only from the four known keys (enabled, action, patterns, logDetections), any other configuration property is silently discarded. The PR description focuses on preventing runtime errors from malformed patterns, but dropping unknown keys would also mean that future extensions of the configuration schema (e.g., a severity level or a custom replacementText) are silently ignored by the scrubber. That would be a breaking change for existing or future code that attaches metadata to the config object and expects it to survive updateConfig. The safer pattern is to spread the provided config first and then overwrite the four normalized keys, which preserves unknown properties, or to explicitly validate and reject configs with unknown properties.

Code Suggestion:

_normalizeConfig(cfg) {
    const next = (cfg && typeof cfg === 'object') ? { ...cfg } : {};
    // Normalize only the known properties, preserving any others.
    return {
        ...next,
        enabled:       Boolean(next.enabled),
        action:        next.action === 'flag' ? 'flag' : 'redact',
        patterns:      Array.isArray(next.patterns)
            ? next.patterns.filter(p => typeof p === 'string' && p.trim().length > 0)
            : [],
        logDetections: Boolean(next.logDetections),
    };
}
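
As a quick check of the spread semantics this suggestion relies on, the sketch below copies the suggested body into a stand-alone function. The function name normalizeConfig and the severity key are assumptions for illustration only.

```javascript
// Stand-alone copy of the suggested normalization, for illustration only.
function normalizeConfig(cfg) {
    const next = (cfg && typeof cfg === 'object') ? { ...cfg } : {};
    return {
        ...next, // unknown keys are copied first...
        // ...then the four known keys overwrite whatever was copied.
        enabled: Boolean(next.enabled),
        action: next.action === 'flag' ? 'flag' : 'redact',
        patterns: Array.isArray(next.patterns)
            ? next.patterns.filter(p => typeof p === 'string' && p.trim().length > 0)
            : [],
        logDetections: Boolean(next.logDetections),
    };
}

// 'severity' is a hypothetical extra key, not part of the current schema.
const normalized = normalizeConfig({ enabled: 1, patterns: ['email', ''], severity: 'high' });
```

With this shape, normalized.severity is still 'high', while enabled becomes true, action falls back to 'redact', and the empty pattern entry is dropped.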

const next = (cfg && typeof cfg === 'object') ? { ...cfg } : {};
return {
...next,
enabled: Boolean(next.enabled),
action: next.action === 'flag' ? 'flag' : 'redact',

P2 | Confidence: High

The action property is normalized with a strict equality check (=== 'flag'), and every other value defaults to 'redact'. This is simple and safe, but it is case-sensitive and gives no feedback for invalid values: if a UI sends 'FLAG' or 'log', it is silently treated as 'redact'. This is not a security issue, since redact is the more conservative action, but it can confuse operators. Consider warning on unrecognized actions or normalizing the input (e.g., next.action?.toLowerCase()).

Code Suggestion:

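// Before the return statement in _normalizeConfig.
// A missing action falls back to 'redact' without triggering the warning.
const normalizedAction = String(next.action ?? 'redact').toLowerCase();
// Optional: warn if the original value was neither 'redact' nor 'flag'.
if (normalizedAction !== 'redact' && normalizedAction !== 'flag') {
    console.warn(`PiiScrubber: Unrecognized action "${next.action}". Defaulting to "redact".`);
}
// Inside the returned object:
action: normalizedAction === 'flag' ? 'flag' : 'redact',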
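
The same idea can be pulled into a small helper so it can be exercised on its own. This is a sketch under assumptions: normalizeAction and its injectable warn parameter are hypothetical names, not part of the PR.

```javascript
// Hypothetical helper: case-insensitive action normalization with a warning hook.
function normalizeAction(value, warn = console.warn) {
    // A missing action is not an operator error, so it defaults silently.
    if (value === undefined || value === null) return 'redact';
    const normalized = String(value).trim().toLowerCase();
    if (normalized === 'flag' || normalized === 'redact') return normalized;
    warn(`PiiScrubber: Unrecognized action "${value}". Defaulting to "redact".`);
    return 'redact';
}

// Collect warnings in an array instead of printing them.
const warnings = [];
const results = ['FLAG', 'log', undefined].map(v => normalizeAction(v, msg => warnings.push(msg)));
```

In this run, 'FLAG' is accepted as 'flag', 'log' falls back to 'redact' with one warning, and a missing action falls back without one. Passing warn as a parameter keeps the helper testable without touching the console.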

patterns: Array.isArray(next.patterns)
? next.patterns.filter(p => typeof p === 'string' && p.trim().length > 0)
Copilot AI Apr 22, 2026

patterns normalization filters out whitespace-only strings using p.trim().length > 0 but keeps the original (untrimmed) values. This means a config like patterns: [' email '] will be accepted but will never match PATTERNS[name] in scrub(), silently disabling detection. Consider trimming (and optionally de-duplicating) pattern names during normalization so the stored config contains canonical keys.

Suggested change
? next.patterns.filter(p => typeof p === 'string' && p.trim().length > 0)
? [...new Set(
next.patterns
.filter(p => typeof p === 'string')
.map(p => p.trim())
.filter(p => p.length > 0)
)]
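
For reference, the suggested change can be tried in isolation. In this sketch the wrapper name normalizePatternNames and the 'phone' entry are assumptions; only 'email' appears in the PR's tests.

```javascript
// Hypothetical wrapper around the suggested expression: trim, drop empties, de-duplicate.
function normalizePatternNames(patterns) {
    if (!Array.isArray(patterns)) return [];
    return [...new Set(
        patterns
            .filter(p => typeof p === 'string')
            .map(p => p.trim())
            .filter(p => p.length > 0)
    )];
}

const names = normalizePatternNames([' email ', 'email', '   ', 42, 'phone']);
```

Because Set keeps insertion order, names is ['email', 'phone']: the padded and duplicate entries collapse into one canonical key.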

: [],
Comment on lines +196 to +198

P2 | Confidence: High

The normalization logic for patterns silently degrades invalid input: a non-array value becomes an empty array [], and non-string or empty entries are dropped from an array. This aligns with the PR's goal of making the scrubber non-fatal. However, it creates a silent failure mode: a configuration error (e.g., a typo like paterns) or a UI/API bug that sends a malformed payload leaves the PII scrubber with nothing to detect, without any warning or log. For a security-critical component, silently turning off protection is risky. The scrubber should at least log a warning when patterns is normalized to an empty array, especially if the scrubber is enabled. The logDetections flag covers actual PII findings, not configuration errors.

Code Suggestion:

patterns:      Array.isArray(next.patterns)
    ? next.patterns.filter(p => typeof p === 'string' && p.trim().length > 0)
    : (() => {
        if (next.enabled && next.patterns !== undefined) {
            console.warn(`PiiScrubber: Invalid 'patterns' config (${typeof next.patterns}). Defaulting to empty list.`);
        }
        return [];
      })(),
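
A possible variant, sketched under assumptions, warns on the resulting list rather than on the input type. That also covers the misspelled-key case (where patterns is undefined) and arrays whose entries are all invalid. The helper name normalizePatterns and the warn parameter are hypothetical.

```javascript
// Hypothetical helper: warn whenever an enabled scrubber ends up with no usable patterns.
function normalizePatterns(next, warn = console.warn) {
    const valid = Array.isArray(next.patterns)
        ? next.patterns.filter(p => typeof p === 'string' && p.trim().length > 0)
        : [];
    if (next.enabled && valid.length === 0) {
        warn(`PiiScrubber: enabled with no usable patterns (received ${typeof next.patterns}); nothing will be scrubbed.`);
    }
    return valid;
}

const messages = [];
const collect = msg => messages.push(msg);
normalizePatterns({ enabled: true, patterns: 'email' }, collect);   // non-array: warns
normalizePatterns({ enabled: true, paterns: ['email'] }, collect);  // misspelled key: warns
normalizePatterns({ enabled: true, patterns: ['email'] }, collect); // valid: no warning
normalizePatterns({ enabled: false, patterns: null }, collect);     // disabled: no warning
```

One trade-off: this version also warns when an operator deliberately enables the scrubber with an empty list, which may or may not be desirable.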

logDetections: Boolean(next.logDetections),
};
}
}
40 changes: 40 additions & 0 deletions tests/universal-guard-pii-scrubber.test.js
@@ -0,0 +1,40 @@
import { describe, expect, test } from '@jest/globals';
import { PiiScrubber } from '../src/plugins/universal-guard/pii-scrubber.js';

describe('PiiScrubber', () => {
test('handles null patterns config without throwing', () => {
const scrubber = new PiiScrubber({
enabled: true,
action: 'redact',
patterns: null,
logDetections: false,
});

expect(() => scrubber.scrub([{ role: 'user', content: 'email me at x@y.com' }])).not.toThrow();
expect(scrubber.scrub([{ role: 'user', content: 'email me at x@y.com' }]).detections).toEqual([]);
});

test('continues detecting after updateConfig receives invalid patterns', () => {
Copilot AI Apr 22, 2026

The test name says it "continues detecting" after updateConfig receives invalid patterns, but the assertions expect detections to stop and the message to remain unchanged. Rename the test to match the behavior being validated (e.g., that invalid patterns degrades safely to no detection).

Suggested change
test('continues detecting after updateConfig receives invalid patterns', () => {
test('degrades safely to no detection after updateConfig receives invalid patterns', () => {


P2 | Confidence: High

The new unit tests are valuable for locking in the non-throwing behavior, but they do not test updateConfig in isolation. The test 'continues detecting after updateConfig receives invalid patterns' validates the outcome of a subsequent scrub call without asserting that the updateConfig call itself does not throw. The implementation makes that likely, but a direct assertion that scrubber.updateConfig(invalidPayload) does not throw would be more precise and would catch regressions if _normalizeConfig were removed or altered. The suite also lacks coverage for updateConfig with a null or undefined argument, which _normalizeConfig is designed to handle by defaulting to an empty object.

Code Suggestion:

test('updateConfig does not throw on invalid input', () => {
    const scrubber = new PiiScrubber({ enabled: true, patterns: [] });
    expect(() => scrubber.updateConfig({ patterns: 'not-an-array' })).not.toThrow();
    expect(() => scrubber.updateConfig(null)).not.toThrow();
    expect(() => scrubber.updateConfig(undefined)).not.toThrow();
});

const scrubber = new PiiScrubber({
enabled: true,
action: 'redact',
patterns: ['email'],
logDetections: false,
});

const before = scrubber.scrub([{ role: 'user', content: 'x@y.com' }]);
expect(before.messages[0].content).toBe('[REDACTED_EMAIL]');
expect(before.detections).toEqual([{ type: 'email', count: 1 }]);

scrubber.updateConfig({
enabled: true,
action: 'redact',
patterns: 'email',
logDetections: false,
});

const after = scrubber.scrub([{ role: 'user', content: 'x@y.com' }]);
expect(after.detections).toEqual([]);
expect(after.messages[0].content).toBe('x@y.com');
});
});