fix(clp-mcp-server): Refine system prompt to make LLMs' KQL query generation more CLP friendly. #1494
| Original file line number | Diff line number | Diff line change | ||||||||
|---|---|---|---|---|---|---|---|---|---|---|
|
|
@@ -10,24 +10,51 @@ | |||||||||
|
|
||||||||||
| SERVER_NAME = "clp-mcp-server" | ||||||||||
|
|
||||||||||
| # System prompts should be LLM-friendly; while LLMs may not strictly enforce all rules, we | ||||||||||
| # empirically found the following practices effective for LLMs to understand the listed rules: | ||||||||||
| # | ||||||||||
| # 1. Provide concrete examples to explain the rule. | ||||||||||
| # 2. Place critical rules at the beginning and mark them as "CRITICAL". | ||||||||||
| # 3. Use action-first sentence structure (e.g., "Use X to do Y" instead of "To do Y, use X"). | ||||||||||
| # 4. Specify any behaviour that the agent needs to perform (like formatting hyperlinks) very early | ||||||||||
| # in the prompt. | ||||||||||
| # 5. Use terse language and bullet points; avoid complex sentence structures. LLMs can fill in the | ||||||||||
| # gaps. | ||||||||||
| # 6. Let some instructions and details be implicit to avoid overwhelming the LLM. | ||||||||||
| # 7. Use the same example across different rules to maintain consistency. | ||||||||||
| # 8. Don't wrap text since the extra line breaks may influence the LLM's understanding. | ||||||||||
| # fmt: off | ||||||||||
| SYSTEM_PROMPT = ( | ||||||||||
| "You are an AI assistant that helps users query a log database using KQL (Kibana Query Language)." | ||||||||||
| " You should generate a KQL query that accurately expresses the user's intent. The generated KQL" | ||||||||||
| " query should be as specific as possible to minimize the number of log messages returned. When " | ||||||||||
| "displaying log messages, wrap them in hyperlinks with the `link` field from the search result.\n\n" | ||||||||||
| "You should consider the following guidelines to generate KQL queries efficiently:\n" | ||||||||||
| "- Use specific field names and values to narrow down the search.\n" | ||||||||||
| "- Avoid using wildcards (`*`) unless absolutely necessary, as they can lead to large result" | ||||||||||
| " sets.\n" | ||||||||||
| "- Use logical operators (`AND`, `OR`, `NOT`) to combine one or more key-value searches.\n" | ||||||||||
| "- Consider specifying a time range to narrow down the search. Use" | ||||||||||
| " `search_by_kql_with_timestamp_range` with your KQL query and explicit start and end timestamps." | ||||||||||
| " Timestamps must follow the ISO 8601 UTC format (`YYYY-MM-DDTHH:mm:ss.fffZ`), where the trailing" | ||||||||||
| " `Z` indicates UTC.\n" | ||||||||||
| "- If the user query is ambiguous or lacks detail, ask clarifying questions to better understand" | ||||||||||
| " their intent before generating the KQL query.\n" | ||||||||||
| "- Always ensure that the generated KQL query is syntactically correct and can be executed without" | ||||||||||
| " errors." | ||||||||||
| "You are an AI assistant for querying the CLP log database using CLP-KQL (CKQL). Your job is to" | ||||||||||
| " generate CKQL that faithfully expresses the user's intent and show key logs to the user:\n" | ||||||||||
| "- Start broad to learn the schema/fields using wildcard searches like *, then narrow the query to" | ||||||||||
| " return a manageable result set.\n" | ||||||||||
| "- When showing log messages or when the user wants to see log messages, provide the hyperlink from" | ||||||||||
| " the result's link field.\n" | ||||||||||
| "\n" | ||||||||||
| "CKQL rules (read carefully; items marked CRITICAL will fail if violated):\n" | ||||||||||
| "- CRITICAL -- Substrings: use wildcards for partial matches -- * (any sequence), ? (single" | ||||||||||
| " character).\n" | ||||||||||
| " Example:\n" | ||||||||||
| " request: *GET*\n" | ||||||||||
| "\n" | ||||||||||
| "- Combining conditions: use AND / OR (case-insensitive).\n" | ||||||||||
| " Example:\n" | ||||||||||
| " request: GET AND response: 400\n" | ||||||||||
| "\n" | ||||||||||
| "- CRITICAL -- Multi-word text must be quoted: wrap multi-word searches in double quotes.\n" | ||||||||||
| " Example:\n" | ||||||||||
| ' request: "*GET wp-admin*"\n' | ||||||||||
| " (quotes and wildcards are required).\n" | ||||||||||
| "\n" | ||||||||||
| "- Escaping characters:\n" | ||||||||||
| ' - In keys, use backslash to escape searching for any of the literal characters: \\, ", ., *,' | ||||||||||
| " @, $, !, #.\n" | ||||||||||
| ' - In values, use backslash to escape searching for any of the literal characters: \\, ", ?,' | ||||||||||
| " *.\n" | ||||||||||
| "- Time range: use search_by_kql_with_timestamp_range to constrain by time.\n" | ||||||||||
|
Missing critical timestamp format specification.

According to the PR discussion, you explicitly stated that the timestamp format is critical: "query with timestamp that does not follows ISO 8601 in UTC format will be rejected, so we want to make sure AI generates the exact format." You proposed including: "Timestamps must follow the ISO 8601 UTC format (…"

This specification has been flagged as a major issue in multiple past reviews and was marked as addressed in previous commits, but it is missing from the current code. Without it, LLMs may generate timestamps in incorrect formats (e.g., Unix timestamps, locale-specific formats), causing query failures and undermining the PR's goal of achieving 100% KQL accuracy.

Apply this diff to add the timestamp format specification:

-"- Time range: use search_by_kql_with_timestamp_range to constrain by time.\n"
+"- Time range: use search_by_kql_with_timestamp_range to constrain by time. Timestamps must"
+" follow ISO 8601 UTC format: YYYY-MM-DDTHH:mm:ss.fffZ (trailing Z indicates UTC). Example:"
+" 2024-10-15T08:30:00.000Z\n"
|
||||||||||
| "\n" | ||||||||||
| "- Unsupported: no fuzzy matches; no less/greater-than comparisons on strings, IPs, or timestamps." | ||||||||||
| "\n" | ||||||||||
| ) | ||||||||||
| # fmt: on | ||||||||||
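For reference, the rules in the new prompt (wildcards for substrings, quotes for multi-word text, backslash escaping in values) and the timestamp format requested in the review comment can be sketched as small Python helpers. This is only an illustration of what a well-formed query and timestamp look like, not code from this PR: the function names `escape_ckql_value`, `substring_query`, and `to_ckql_timestamp` are hypothetical, the key is assumed to be already escaped, and the sketch assumes the server accepts exactly the `YYYY-MM-DDTHH:mm:ss.fffZ` form quoted in the review.

```python
from datetime import datetime, timezone

# Characters the prompt lists as needing a backslash escape inside values.
_VALUE_SPECIALS = ("\\", '"', "?", "*")


def escape_ckql_value(text: str) -> str:
    """Backslash-escape literal backslash, double quote, ? and * in a value."""
    return "".join("\\" + ch if ch in _VALUE_SPECIALS else ch for ch in text)


def substring_query(key: str, text: str) -> str:
    """Build a partial-match query: wildcards around the text, quoted if multi-word.

    Hypothetical helper; `key` is assumed to be escaped by the caller.
    """
    pattern = f"*{escape_ckql_value(text)}*"
    if any(ch.isspace() for ch in text):
        # Multi-word text must be wrapped in double quotes.
        pattern = f'"{pattern}"'
    return f"{key}: {pattern}"


def to_ckql_timestamp(dt: datetime) -> str:
    """Format a timezone-aware datetime as YYYY-MM-DDTHH:mm:ss.fffZ in UTC."""
    if dt.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    utc = dt.astimezone(timezone.utc)
    return utc.isoformat(timespec="milliseconds").replace("+00:00", "Z")


if __name__ == "__main__":
    print(substring_query("request", "GET"))
    print(substring_query("request", "GET wp-admin"))
    print(to_ckql_timestamp(datetime(2024, 10, 15, 8, 30, tzinfo=timezone.utc)))
```

The two `substring_query` calls reproduce the examples already used in the prompt (`request: *GET*` and `request: "*GET wp-admin*"`), which keeps the sketch consistent with guideline 7 in the comment block above.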