fix: deduplicate relation edges using MERGE instead of CREATE #33

Open
back2zion wants to merge 1 commit into nikmcfly:main from back2zion:fix/deduplicate-relation-edges

@back2zion

Summary

  • Entity nodes use MERGE to prevent duplicates, but relation edges use CREATE, which produces duplicate edges when the same (source, relation_type, target) triple appears across multiple text chunks
  • Example: "person A works_for company B" mentioned in 4 chunks → 4 identical edges created

Fix

Changed CREATE to MERGE for relation edges, keyed on (graph_id, name) between the same source and target nodes:

  • ON CREATE: initializes all properties as before
  • ON MATCH: updates fact/embedding and appends episode_id to the list
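
A rough sketch of what the changed query might look like, held as a Cypher string in Python. The `Entity` label, the `RELATION` relationship type, the property names (`fact`, `fact_embedding`, `episode_ids`, `created_at`), and the `relation_params` helper are all assumptions for illustration; the actual names in the diff may differ.

```python
# Hypothetical sketch: the label, relationship type, and property names
# below are assumed, not copied from the actual change.
UPSERT_RELATION = """
MATCH (s:Entity {graph_id: $graph_id, name: $source})
MATCH (t:Entity {graph_id: $graph_id, name: $target})
MERGE (s)-[r:RELATION {graph_id: $graph_id, name: $name}]->(t)
ON CREATE SET
    r.fact = $fact,
    r.fact_embedding = $embedding,
    r.episode_ids = [$episode_id],
    r.created_at = datetime()
ON MATCH SET
    r.fact = $fact,
    r.fact_embedding = $embedding,
    r.episode_ids = coalesce(r.episode_ids, []) + $episode_id
"""


def relation_params(graph_id, source, name, target, fact, embedding, episode_id):
    """Build the parameter map for UPSERT_RELATION (hypothetical helper)."""
    return {
        "graph_id": graph_id,
        "source": source,
        "name": name,
        "target": target,
        "fact": fact,
        "embedding": embedding,
        "episode_id": episode_id,
    }
```

With the official Neo4j Python driver, this would typically be executed as `session.run(UPSERT_RELATION, relation_params(...))`. Because the MERGE pattern is anchored on the already-matched `s` and `t`, only the edge is matched or created, never the nodes.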

Before

Multiple chunks mentioning the same relationship → duplicate edges in graph

After

Same (source, relation, target) → single edge, enriched with all episode references

Test plan

  • Verified single edge created for repeated "works_for" relationships across chunks
  • Confirmed episode_ids accumulate on matched edges
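
The two checks above can be illustrated without a database. The snippet below is a minimal in-memory model of the intended MERGE semantics, not the project's test code; `merge_edge` and the sample values are hypothetical.

```python
def merge_edge(edges, graph_id, source, name, target, fact, episode_id):
    """Mimic MERGE on an edge keyed by (graph_id, source, name, target)."""
    key = (graph_id, source, name, target)
    edge = edges.get(key)
    if edge is None:
        # ON CREATE: initialize the edge properties.
        edges[key] = {"fact": fact, "episode_ids": [episode_id]}
    else:
        # ON MATCH: refresh the fact and append the episode reference.
        edge["fact"] = fact
        edge["episode_ids"].append(episode_id)
    return edges[key]


edges = {}
# The same relationship is mentioned in four chunks.
for i in range(1, 5):
    merge_edge(edges, "g1", "person A", "works_for", "company B",
               f"fact from chunk {i}", f"ep{i}")

edge = edges[("g1", "person A", "works_for", "company B")]
assert len(edges) == 1
assert edge["episode_ids"] == ["ep1", "ep2", "ep3", "ep4"]
```

Under CREATE semantics the same loop would have produced four separate edges, each carrying a single episode reference.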

When the same (source, relation_type, target) triple appears in multiple
text chunks, the graph builder creates duplicate edges because it uses
CREATE unconditionally. Entity nodes already use MERGE for dedup, but
relation edges did not.

Changed CREATE to MERGE keyed on (graph_id, name) between the same
source and target nodes. On duplicate, the fact and embedding are
updated and the episode_id is appended to the list.