User Story
As a security-conscious developer,
I want to sanitize user-controlled inputs in prompt formatting
so that malicious XML tags can't disrupt LLM response parsing.
Background
The current user_prompt.format() in eknowledge/main.py directly inserts raw text into XML-structured LLM prompts. This allows injection of fake <node> entries through inputs containing XML syntax (e.g., "<node><from_node>HACK</from_node>"). The vulnerability exists in:
# main.py line 92:
HumanMessage(content=user_prompt.format(text=chunk, relationships=relations))
Attackers could manipulate knowledge graph outputs by poisoning text inputs with XML tags, potentially creating虚假 relationships or disrupting parsing logic.
Acceptance Criteria
User Story
As a security-conscious developer,
I want to sanitize user-controlled inputs in prompt formatting
so that malicious XML tags can't disrupt LLM response parsing.
Background
The current
user_prompt.format()ineknowledge/main.pydirectly inserts raw text into XML-structured LLM prompts. This allows injection of fake<node>entries through inputs containing XML syntax (e.g.,"<node><from_node>HACK</from_node>"). The vulnerability exists in:Attackers could manipulate knowledge graph outputs by poisoning text inputs with XML tags, potentially creating虚假 relationships or disrupting parsing logic.
Acceptance Criteria
execute_graph_generationineknowledge/main.pyto sanitize text inputs<,>,&) with entities (<,>,&) before string formattingtests/test_eknowledge.pythat verifies:<node>TEST</node>get converted to<node>TEST</node>in prompts