Block operational commands in AI chat#80
Conversation
Shell-style prompts such as cd ../ can retrieve unrelated RAG chunks and reach the local model. Treat command execution, filesystem traversal, and server probing patterns as security-boundary requests before search or LLM invocation, while preserving FairPlay-domain fallback answers. Constraint: no-context FairPlay help should remain available Rejected: refuse every question without RAG context | too conservative for normal product guidance Confidence: high Scope-risk: narrow Directive: Keep command and server-probing inputs out of vector search so false-positive chunks cannot launder them into LLM prompts Tested: ./gradlew test --tests com.fairing.fairplay.ai.rag.service.RagChatServiceScopeTest --no-daemon Tested: git diff --check Co-authored-by: OmX <omx@oh-my-codex.dev>
|
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Code Review
This pull request introduces operational command detection to prevent command injection and path traversal attacks in the RAG chat service, along with a corresponding unit test. The review feedback highlights critical issues with the implementation: the regular expression used to detect commands has a high false-positive rate for common English words (such as 'free', 'find', 'top', 'less') which are highly relevant to an event platform, and compiling the regex on every call via String.matches() introduces performance overhead. Additionally, the check for 'idrsa' should be expanded to include 'id_rsa' to properly detect SSH private key exposure attempts.
| return true; | ||
| } | ||
|
|
||
| return normalized.matches("^(cd|ls|ll|pwd|cat|env|printenv|whoami|id|uname|hostname|ps|top|htop|free|df|du|docker|kubectl|curl|wget|ssh|scp|sudo|su|rm|cp|mv|chmod|chown|find|grep|tail|head|less|vi|vim|nano)(\\s+.*)?$"); |
There was a problem hiding this comment.
There are two major issues with this regular expression check:
- High False-Positive Rate: The regex matches any query starting with common English words like
free,find,top,less,cat, orid(e.g.,"free tickets","find events near me","top 10 events","less than 10000 won"). Since this is an event and ticketing platform, these are highly legitimate user queries that will be incorrectly blocked. - Performance Overhead:
String.matches()compiles the regular expression pattern on every single invocation, which introduces significant CPU overhead under high traffic.
Recommendations:
- Remove highly common English words (
free,find,top,less,cat,id,env) from the prefix matching list, or only block them if they are accompanied by typical shell arguments/flags (e.g.,free -m,find .). - Precompile the regular expression as a
private static final Patternat the class level (e.g.,OPERATIONAL_COMMAND_PATTERN).
| return normalized.matches("^(cd|ls|ll|pwd|cat|env|printenv|whoami|id|uname|hostname|ps|top|htop|free|df|du|docker|kubectl|curl|wget|ssh|scp|sudo|su|rm|cp|mv|chmod|chown|find|grep|tail|head|less|vi|vim|nano)(\\s+.*)?$"); | |
| return OPERATIONAL_COMMAND_PATTERN.matcher(normalized).matches(); |
| compact.contains("/proc/") || | ||
| compact.contains("/var/log") || | ||
| compact.contains("~/.ssh") || | ||
| compact.contains("idrsa") || |
There was a problem hiding this comment.
The check compact.contains("idrsa") will not match the standard SSH private key filename id_rsa because compact is created by removing spaces (normalized.replace(" ", "")), which preserves underscores. To ensure robust blocking of private key exposure attempts, please add a check for id_rsa as well.
compact.contains("id_rsa") ||
compact.contains("idrsa") ||
Blocks shell-style and server-probing prompts before vector search, so false-positive RAG chunks cannot route operational commands into the local LLM.
Included:
Tested: ./gradlew test --tests com.fairing.fairplay.ai.rag.service.RagChatServiceScopeTest --no-daemon.
Tested: git diff --check.